PRELIMINARY AMENDMENT

Box PCT 

Assistant Commissioner for Patents 
Washington, D.C. 20231 

Dear Sir: 

In connection with the above-identified application filed herewith, please enter the 
following preliminary amendment. 

IN THE ABSTRACT 

Insert the attached Abstract page into the application as the last page thereof. 

IN THE SPECIFICATION 

A courtesy copy of the present specification is enclosed herewith. However, the 
World Intellectual Property Office (WIPO) copy should be relied upon if it is already in the U.S. 
Patent Office. 



IN THE CLAIMS 

Please amend the following claims as indicated below. A marked-up copy of all 
claims is attached for reference. 

4. (Amended) A secondary or three-dimensional structure of a glycosyltransferase as 
defined in claim 1 that is a crystalline form. 

5. (Amended) A secondary or three-dimensional structure of a glycosyltransferase as 
defined in claim 1, wherein the glycosyltransferase is an N- 
acetylglucosaminyltransferase. 

6. (Amended) A secondary or three-dimensional structure of a glycosyltransferase as 
defined in claim 1 having one or both of the following characteristics: 

(a) an N-terminal domain comprising an eight-stranded mixed β-sheet flanked by
six helices, and a small two-stranded antiparallel β-sheet; and

(b) a C-terminal domain comprising a four-stranded mixed β-sheet flanked by
three α-helices and a short β-finger.

8. (Amended) A secondary or three-dimensional structure of a glycosyltransferase as
defined in claim 1 having the structural coordinates of a glycosyltransferase listed in
Table 1, 2, 3, or 4.

13. (Amended) A crystalline form as claimed in claim 11 further characterized by the
parameters, diffraction statistics, and/or refinement statistics in Table 6.

14. (Amended) A secondary or three-dimensional structure of a binding site of a
secondary or three-dimensional structure of a glycosyltransferase as defined in claim
1.

16. (Amended) A secondary or three-dimensional structure of a binding site of a
glycosyltransferase as defined in claim 1 wherein the binding site is also defined by 
the atomic interactions of Table 5, preferably the enzyme atomic contacts. 

17. (Amended) A secondary or three-dimensional structure of a binding site of a 
glycosyltransferase as defined in claim 1 wherein the binding site is defined by 
atomic interactions 1 to 5; 6 and 7; 8, 9 and 10; 1 to 13; 14 to 21; 22 to 27; 1 to 13; 1 
to 21; or 11, 12, 13, and 27 listed in Table 5, or the enzyme atomic contacts for these 
atomic interactions listed in Table 5.

18. (Amended) A secondary or three-dimensional structure of an spsA GnT 1 core
(SGC) domain of a secondary or three-dimensional structure of a
glycosyltransferase as defined in claim 1.

20. (Amended) A modulator of the activity of a glycosyltransferase derived from a
secondary or three-dimensional structure as claimed in claim 1.

23. (Amended) A method for identifying a modulator of a glycosyltransferase by
determining binding interactions between a test compound and secondary or three- 
dimensional structures of binding sites as defined in claim 1 comprising: 

(a) generating the binding sites on a computer screen; 

(b) generating a test compound with its spatial structure on the computer screen; 
and 

(c) testing to determine whether the test compound binds to a selected number of 
binding sites. 

24. (Amended) A method for identifying a potential modulator of a glycosyltransferase 
function comprising the steps:

(a) docking a computer representation of a compound from a computer data base
with a computer representation of a secondary or three-dimensional structure
of a glycosyltransferase or a binding site as defined in claim 1, to obtain a
complex;

(b) determining a conformation of the complex with a favourable geometric fit 
and favourable complementary interactions; and 

(c) identifying compounds that best fit the selected site as potential modulators of
the glycosyltransferase.

25. (Amended) A method for identifying a potential modulator of a glycosyltransferase
function comprising the steps: 

(a) modifying a computer representation of a compound complexed with a 
secondary or three-dimensional structure of a glycosyltransferase or a binding site
as defined in claim 1, by deleting or adding a chemical group or groups; 

(b) determining a conformation of the complex with a favourable geometric fit 
and favourable complementary interactions; and 

(c) identifying a compound that best fits the binding cavity as a potential 
modulator of a glycosyltransferase.

26. (Amended) A method for identifying a potential modulator of a glycosyltransferase
function comprising the steps: 

(a) selecting a computer representation of a compound complexed with a 
secondary or three-dimensional structure of a glycosyltransferase or a binding site
as defined in claim 1; and 

(b) searching for molecules in a data base that are similar to the compound using a
searching computer program, or replacing portions of the compound with similar
chemical structures from a data base using a compound building computer
program.

27. (Amended) A modulator of a glycosyltransferase identified by a method as claimed 
in claim 1.

28. (Amended) A method for designing potential inhibitors of a glycosyltransferase 
comprising the step of using the structural coordinates of a sugar nucleotide donor or 
acceptor or component thereof, defined in relation to its spatial association with the
three-dimensional structure of a glycosyltransferase or a binding site as defined in
claim 1, to generate a compound that is capable of associating with the
glycosyltransferase or binding cavity thereof. 

29. (Amended) A modulator of a glycosyltransferase based on a three-dimensional
structure of a sugar nucleotide donor, an acceptor, or a component thereof, defined in
relation to the sugar nucleotide donor's or acceptor's spatial association with a 
secondary or three-dimensional structure of a glycosyltransferase or binding site as 
defined in claim 1. 

30. (Amended) A pharmaceutical composition comprising a modulator as claimed in
claim 1 either alone or with other active substances.

32. (Amended) Use of a modulator identified by the methods of claim 1 in the preparation
of a medicament to treat a disease associated with a glycosyltransferase with
inappropriate activity in a cellular organism.

34. (Amended) Machine readable media encoded with data representing the structural
coordinates of a secondary or three-dimensional structure of a glycosyltransferase or a
binding site as defined in claim 1.






REMARKS 

A new abstract page is supplied to conform to that appearing on the publication
page of the WIPO application, but the new Abstract is typed on a separate page as required by
U.S. practice.

Applicants respectfully request that the preliminary amendment described herein
be entered into the record prior to calculation of the filing fee and prior to examination and 
consideration of the above-identified application. 

If a telephone conference would be helpful in resolving any issues concerning this 
communication, please contact Applicants' primary attorney of record, Douglas P. Mueller (Reg.
No. 30,300), at (612) 371-5237.

Respectfully submitted, 

MERCHANT & GOULD P.C. 
P.O. Box 2903 

Minneapolis, Minnesota 55402-0903 
(612) 332-5300 



Dated: December 18, 2001 




Reg. No. 30,300

MARKED-UP COPY OF CLAIMS

1. A secondary or three-dimensional structure of a purified glycosyltransferase when it
associates with a nucleotide sugar donor, acceptor, or metal cofactor.

2. A secondary or three-dimensional structure of a purified glycosyltransferase in
association with a moiety. 

3. A secondary or three-dimensional structure as claimed in claim 2, wherein the moiety 
is a nucleotide sugar donor, acceptor, metal cofactor, or heavy metal atom. 

4. A secondary or three-dimensional structure of a glycosyltransferase as defined in [any
of the preceding] claim[s] 1 that is a crystalline form.

5. A secondary or three-dimensional structure of a glycosyltransferase as defined in [any
of the preceding] claim[s] 1, wherein the glycosyltransferase is an N-
acetylglucosaminyltransferase.

6. A secondary or three-dimensional structure of a glycosyltransferase as defined in [any
of the preceding] claim[s] 1 having one or both of the following characteristics:

(a) an N-terminal domain comprising an eight-stranded mixed β-sheet flanked by
six helices, and a small two-stranded antiparallel β-sheet; and

(b) a C-terminal domain comprising a four-stranded mixed β-sheet flanked by
three α-helices and a short β-finger.

7. A secondary or three-dimensional structure of a glycosyltransferase as defined in
claim 6 further characterized by the N-terminal domain and C-terminal domain being 
connected by a linker region, which wraps halfway around the N-terminal domain 
before starting the first helix of the C-terminal domain.

8. A secondary or three-dimensional structure of a glycosyltransferase as defined in [any
of the preceding] claim[s] 1 having the structural coordinates of a glycosyltransferase
listed in Table 1, 2, 3, or 4.

9. A secondary or three-dimensional structure of a glycosyltransferase in association 
with a sugar nucleotide donor having the structural coordinates of a 
glycosyltransferase and a sugar nucleotide donor listed in Table 3. 

10. A secondary or three-dimensional structure of a glycosyltransferase in association 
with an acceptor having the structural coordinates of a glycosyltransferase and an 
acceptor listed in Table 4. 

11. A crystalline form of a glycosyltransferase having a unit cell with dimensions of a =
40.4 ± 3 Å, b = 82.4 ± 3 Å, and c = 102.5 ± 3 Å.

12. A crystalline form of an N-acetylglucosaminyltransferase having the structural
coordinates listed in Table 1, 2, 3, or 4, and a unit cell with dimensions of a = 40.4 ±
3 Å, b = 82.4 ± 3 Å, and c = 102.5 ± 3 Å.

13. A crystalline form as claimed in claim 11 [or 12] further characterized by the
parameters, diffraction statistics, and/or refinement statistics in Table 6. 

14. A secondary or three-dimensional structure of a binding site of a secondary or three- 
dimensional structure of a glycosyltransferase as defined in [any of the preceding] 
claim[s] 1.

15. A secondary or three-dimensional structure of a binding site as claimed in claim 14 
wherein the binding site is defined by its association with one or more of a 
diphosphate group of a sugar nucleotide donor, a nucleotide of a sugar nucleotide
donor, a sugar of a nucleotide of a sugar nucleotide donor, a selected sugar of a sugar
nucleotide donor that is transferred to an acceptor, and/or an acceptor. 

16. A secondary or three-dimensional structure of a binding site of a glycosyltransferase 
as defined in [the preceding] claim[s] 1 wherein the binding site is also defined by the
atomic interactions of Table 5, preferably the enzyme atomic contacts. 

17. A secondary or three-dimensional structure of a binding site of a glycosyltransferase 
as defined in [the preceding] claim[s] 1 wherein the binding site is defined by atomic
interactions 1 to 5; 6 and 7; 8, 9 and 10; 1 to 13; 14 to 21; 22 to 27; 1 to 13; 1 to 21;
or 11, 12, 13, and 27 listed in Table 5, or the enzyme atomic contacts for these atomic
interactions listed in Table 5. 

18. A secondary or three-dimensional structure of an spsA GnT 1 core (SGC) domain of a 
secondary or three-dimensional structure of a glycosyltransferase as defined in [any of 
the preceding] claim[s] 1.

19. A secondary or three-dimensional structure of an SGC domain as claimed in claim 18 
characterized by an eight-stranded mixed β-sheet, flanked by six helices, and a small
two-stranded antiparallel β-sheet.

20. A modulator of the activity of a glycosyltransferase derived from a secondary or
three-dimensional structure as claimed in [any of the preceding] claim[s] 1.

21. A method of determining three-dimensional structures of polypeptides with unknown 
structure comprising the step of applying the structural coordinates of Table 1, 2, 3, or 
4.

22. A method for identifying a potential modulator of a glycosyltransferase, or binding
sites or domains thereof, comprising the step of using the structural coordinates of 
Table 1, 2, 3, or 4 that define a glycosyltransferase or binding sites or domains 
thereof, to computationally evaluate a test compound for its ability to associate with 
the glycosyltransferase, binding sites or domains thereof, wherein a test compound 
that associates is a potential modulator of a glycosyltransferase. 

23. A method for identifying a modulator of a glycosyltransferase by determining binding 
interactions between a test compound and secondary or three-dimensional structures 
of binding sites as defined in [any of the preceding] claim[s] 1 comprising:

(a) generating the binding sites on a computer screen; 

(b) generating a test compound with its spatial structure on the computer screen; 
and 

(c) testing to determine whether the test compound binds to a selected number of 
binding sites. 

24. A method for identifying a potential modulator of a glycosyltransferase function 
comprising the steps: 

(a) docking a computer representation of a compound from a computer data base
with a computer representation of a secondary or three-dimensional structure
of a glycosyltransferase or a binding site as defined in [any of the preceding]
claim[s] 1, to obtain a complex;

(b) determining a conformation of the complex with a favourable geometric fit
and favourable complementary interactions; and

(c) identifying compounds that best fit the selected site as potential modulators of
the glycosyltransferase.

25. A method for identifying a potential modulator of a glycosyltransferase function 
comprising the steps: 

(a) modifying a computer representation of a compound complexed with a
secondary or three-dimensional structure of a glycosyltransferase or a binding
site as defined in [any of the preceding] claim[s] 1, by deleting or adding a
chemical group or groups;

(b) determining a conformation of the complex with a favourable geometric fit
and favourable complementary interactions; and

(c) identifying a compound that best fits the binding cavity as a potential
modulator of a glycosyltransferase. 

26. A method for identifying a potential modulator of a glycosyltransferase function 
comprising the steps: 

(a) selecting a computer representation of a compound complexed with a 
secondary or three-dimensional structure of a glycosyltransferase or a binding 
site as defined in [any of the preceding] claim[s] 1; and

(b) searching for molecules in a data base that are similar to the compound using a 
searching computer program, or replacing portions of the compound with 
similar chemical structures from a data base using a compound building 
computer program. 

27. A modulator of a glycosyltransferase identified by a method as claimed in [any of the 
preceding] claim[s] 1.

28. A method for designing potential inhibitors of a glycosyltransferase comprising the
step of using the structural coordinates of a sugar nucleotide donor or acceptor or
component thereof, defined in relation to its spatial association with the three-
dimensional structure of a glycosyltransferase or a binding site as defined in [any of
the preceding] claim[s] 1, to generate a compound that is capable of associating with
the glycosyltransferase or binding cavity thereof. 

29. A modulator of a glycosyltransferase based on a three-dimensional structure of a 
sugar nucleotide donor, an acceptor, or a component thereof, defined in relation to the 
sugar nucleotide donor's or acceptor's spatial association with a secondary or three- 
dimensional structure of a glycosyltransferase or binding site as defined in [the 
preceding] claim[s] 1.

30. A pharmaceutical composition comprising a modulator as claimed in [any of the 
preceding] claim[s] 1 either alone or with other active substances.

31. A method of treating a disease associated with a glycosyltransferase with 
inappropriate activity in a cellular organism, comprising: 

(a) administering a pharmaceutical composition as claimed in claim 30; and 

(b) activating or inhibiting a glycosyltransferase to treat the disease. 

32. Use of a modulator identified by the methods of [any of the preceding] claim[s] 1 in
the preparation of a medicament to treat a disease associated with a 
glycosyltransferase with inappropriate activity in a cellular organism. 

33. Use of structural coordinates of a glycosyltransferase structure as set out in Table 1, 2, 
3, or 4 to manufacture a medicament. 

34. Machine readable media encoded with data representing the structural coordinates of
a secondary or three-dimensional structure of a glycosyltransferase or a binding site
as defined in [any of the preceding] claim[s] 1.

35. A machine readable media as claimed in claim 34 wherein the data also includes
structural coordinates for a nucleotide sugar donor, acceptor, metal cofactor, or heavy
metal atom.

JCIS Rec'd PCT/PTO 18 DEC 2001

TITLE: Glycosyltransferases Structures
FIELD OF THE INVENTION

The invention relates to the secondary and three dimensional structures of glycosyltransferases. The
atomic coordinates that define the structure and any compounds bound to the structure may be used to
determine glycosyltransferase homologues and the structures of polypeptides with unknown structure, and to
identify modulators of glycosyltransferases.
BACKGROUND OF THE INVENTION

The oligosaccharide chains of N- and O-linked glycoproteins play a crucial role in a number of 
biological processes. Their biosynthesis and degradation pathways are therefore areas of significant interest 
for biology, medicine, and biotechnology. The assembly of the various types of oligosaccharides involves 
several glycosidases and glycosyltransferases. In comparison with glycosidases, the mechanisms of which
have been characterized in some detail, mechanistic investigations on glycosyltransferases have not yet
undergone much scrutiny, although some kinetic studies have been reported.

Glycosyltransferases are a diverse group of enzymes that catalyze the transfer of a single
monosaccharide unit from a donor to the hydroxyl group of an acceptor saccharide. The acceptor can be either
a free saccharide, glycoprotein, glycolipid, or polysaccharide. The donor can be a nucleotide-sugar, or
dolichol-phosphate-sugar. Glycosyltransferases show a precise specificity for both the sugar acceptor and
donor, and generally require the presence of a metal cofactor. 
SUMMARY OF THE INVENTION 

Broadly stated, the present invention relates to the secondary and three-dimensional structures of
glycosyltransferases, and parts thereof. The glycosyltransferase structure may be the structure the enzyme
takes up when it is associated with one or more moieties (e.g. an acceptor, a sugar nucleotide donor, or
components thereof). The invention also contemplates a glycosyltransferase structure comprising a secondary
or three-dimensional structure of a glycosyltransferase in association with a moiety. The defined boundaries
and properties of the structures and any of the moieties bound to it are pertinent to methods for determining
the secondary or three-dimensional structures of polypeptides with unknown structure, and to methods that
identify modulators of glycosyltransferases. These modulators are potentially useful as therapeutics for
diseases, including (but not limited to) tumor growth, metastasis of tumors, bacterial, viral, and parasitic
infections, and inflammatory diseases such as rheumatoid arthritis, asthma, inflammatory bowel disease, and
atherosclerosis. 

In an embodiment, the invention provides a crystalline form of a polypeptide corresponding to a
glycosyltransferase, or a part thereof. The invention preferably contemplates a crystalline form a
glycosyltransferase takes up when it is complexed with a moiety, including a nucleotide sugar donor,
acceptor, metal cofactor, or heavy metal atom. The crystalline form may also comprise one or more heavy
metal atoms, or at least one compound. A unit cell of the crystalline form of the invention may have
dimensions of about a = 40.4 ± 3.0 Å, b = 82.4 ± 3.0 Å, c = 102.5 ± 3.0 Å.
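
By way of illustration only, the stated tolerance can be expressed as a simple check. The following minimal Python sketch tests whether measured unit cell edges fall within the ranges given above. The function name, the constant names, and the sample measurements are hypothetical; only the nominal dimensions and the ± 3.0 Å tolerance are taken from the text.

```python
# Nominal unit cell edges (in angstroms) and tolerance, as stated in the text.
NOMINAL_CELL = {"a": 40.4, "b": 82.4, "c": 102.5}
TOLERANCE = 3.0


def cell_within_tolerance(a, b, c, nominal=NOMINAL_CELL, tol=TOLERANCE):
    """Return True if every measured edge lies within +/- tol of its nominal value."""
    measured = {"a": a, "b": b, "c": c}
    return all(abs(measured[key] - nominal[key]) <= tol for key in nominal)


if __name__ == "__main__":
    # Hypothetical measurements, not data from the specification.
    print(cell_within_tolerance(41.0, 80.1, 104.9))
    print(cell_within_tolerance(45.0, 82.4, 102.5))
```

The check treats the three edges independently, which matches the way the dimensions are listed; it does not consider cell angles or space group, which the passage above does not state.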

A glycosyltransferase structure of the invention may also be characterized by one or more of the
following:

(a) an N-terminal domain (amino acid residues 106-317 in Table 3) comprising an eight-stranded
mixed β-sheet (β1-β8 in Figure 25) flanked by six helices (α1-α6 in Figure 25) and a small two-
stranded antiparallel β-sheet (β4' and β8' in Figure 25); and

(b) a C-terminal domain (amino acid residues 354-447 in Table 3) comprising a four-stranded mixed
β-sheet (β9, β10, β13, and β14 in Figure 25) flanked by three α-helices (α7-α9 in Figure 25)
and a short β-finger (β11 and β12 in Figure 25).

The N-terminal domain and C-terminal domain may be connected by a linker region (residues 331 to
353 in Table 3) which wraps halfway around the N-terminal domain before starting the first helix of the C-
terminal domain.

The crystalline form may also be specifically characterized by the parameters, diffraction statistics
and/or refinement statistics set out in Table 6.

The invention also contemplates a secondary or three-dimensional structure (e.g. a crystalline form) of
a domain of a glycosyltransferase. In accordance with one aspect, the invention contemplates a secondary or
three-dimensional structure of a domain comprising an eight-stranded mixed β-sheet, flanked by six helices
and a small two-stranded antiparallel β-sheet. The domain is also referred to herein as the "spsA GnT 1 core
domain" or "SGC domain". In accordance with a preferred embodiment, the invention contemplates a domain
comprising an eight-stranded mixed β-sheet represented as β1-β8 in Figure 25, flanked by six helices
represented by α1-α6 in Figure 25, and a small two-stranded antiparallel β-sheet represented by β4' and β8'
in Figure 25. A secondary or three-dimensional structure of a polypeptide comprising an SGC domain of the
invention is also within the scope of the invention.

The invention further contemplates a loop structure of a glycosyltransferase. A loop structure may be 
characterized as the structure adjacent to the nucleotide-sugar donor binding site comprising amino acid 
residues 318-330 in Table 3. The loop structure may be further characterized by amino acid residues 320-323 
forming a type IV turn and amino acid residues 324-330 making one complete turn of an α-helix. A secondary
or three dimensional structure of a polypeptide comprising a loop structure of the invention is also within the
scope of the invention. 

The invention also relates to a method of forming a crystalline form of the invention. 
The invention also features a method of determining secondary or three-dimensional structures of 
polypeptides with unknown structure comprising the step of applying the structural atomic coordinates of a
crystalline form of a glycosyltransferase of the invention.

The invention also provides a secondary or three-dimensional structure of a binding site of a 
glycosyltransferase. Binding sites include the binding sites for one or more of a diphosphate or pyrophosphate
group of a sugar nucleotide donor, a nucleotide of a sugar nucleotide donor, a nitrogenous heterocyclic base
(preferably a pyrimidine base, more preferably uracil) of a sugar nucleotide donor, a sugar of the nucleotide of
a sugar nucleotide donor, a selected sugar of a sugar nucleotide donor that is transferred to an acceptor, and/or
an acceptor. The secondary or three-dimensional structure of a binding site may be defined by selected atomic
contacts in the site. Thus, broadly stated the present invention provides a secondary or three-dimensional
structure of a binding site of a glycosyltransferase defined by one or more atomic interactions or enzyme
atomic contacts as set forth in Table 5. Each of the atomic interactions is defined in Table 5 by an atomic
contact (more preferably, a specific atom where indicated) on the sugar nucleotide donor or acceptor, and an 
atomic contact (more preferably a specific atom where indicated) on the glycosyltransferase. 

The invention also relates to modulators derived from a secondary or three-dimensional structure of a
glycosyltransferase, binding sites, atomic interactions, or atomic contacts thereof, or a domain of a secondary
or three-dimensional structure of a glycosyltransferase, including an SGC domain. Preferably, the modulators
are derived from binding sites for a sugar nucleotide donor or parts thereof, an acceptor or parts thereof,
including the SGC domain, and the binding site described herein as the loop structure. The invention provides
inhibitors that are derived from a DxD motif, for example, peptides having the sequences as shown in Figures
27 and 31 (SEQ ID NOs 1-9).

The present invention also contemplates a method of identifying a modulator of a glycosyltransferase,
a binding site or a domain thereof, comprising the step of using the structural coordinates of a
glycosyltransferase, binding sites, atomic interactions, or atomic contacts thereof, or domain thereof, to
computationally evaluate a test compound for its ability to associate with the glycosyltransferase, binding site,
or domain thereof. Use of the structural coordinates of a glycosyltransferase structure, binding sites, atomic
interactions, or atomic contacts of the invention to identify a modulator is also provided. 

In an embodiment of the invention, a method is provided for identifying a modulator of a 
glycosyltransferase by determining binding interactions between a test compound and a binding site of a 
glycosyltransferase, or atomic interactions, or atomic contacts thereof, or a domain of a glycosyltransferase
defined in accordance with the invention comprising:

(a) generating the binding site, atomic interactions, atomic contacts, or domain on a 
computer screen; 

(b) generating a test compound with its spatial structure on the computer screen; and 

(c) testing to determine whether the test compound binds to the binding site, a selected
number of atomic contacts, or the domain.

Methods are also provided for identifying a potential modulator of a glycosyltransferase function by 
docking a computer representation of a compound with a computer representation of a structure of a 
glycosyltransferase or a part thereof, that is defined by the atomic structural coordinates, atomic interactions, or 
atomic contacts described herein. 

In an embodiment the method comprises the following steps:

(a) docking a computer representation of a compound from a computer data base with a 
computer representation of a selected site (e.g. the sugar nucleotide donor or acceptor 
binding site, loop structure, or SGC domain) on a glycosyltransferase defined in accordance 
with the invention, to obtain a complex; 
(b) determining a conformation of the complex with a favourable geometric fit and favourable
complementary interactions; and

(c) identifying compounds that best fit the selected site as potential modulators of the
glycosyltransferase.
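
Steps (a) to (c) can be pictured with a short Python sketch. It assumes, hypothetically, that a docking program has already produced a numeric score for each candidate conformation of each compound, and that a lower score indicates a more favourable fit. The compound names, the scores, and the function `rank_compounds` are illustrative only and are not taken from the specification.

```python
def rank_compounds(docking_scores, top_n=2):
    """Rank compounds by the score of their best docked conformation.

    docking_scores maps a compound name to a list of scores, one per
    conformation of the complex; lower is assumed to be more favourable.
    """
    # Step (b): keep the most favourable conformation found for each compound.
    best = {name: min(scores) for name, scores in docking_scores.items() if scores}
    # Step (c): order compounds by that best score and keep the leading ones.
    ranked = sorted(best.items(), key=lambda item: item[1])
    return ranked[:top_n]


if __name__ == "__main__":
    # Hypothetical scores standing in for the output of step (a).
    example_scores = {
        "compound_A": [-6.1, -7.4, -5.0],
        "compound_B": [-8.2, -7.9],
        "compound_C": [-4.3],
    }
    print(rank_compounds(example_scores))
```

In this sketch a single score stands in for both geometric fit and complementary interactions; a real scoring scheme may weigh these separately.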

In another embodiment the method comprises the following steps:

(a) modifying a computer representation of a compound complexed with a selected site (e.g. 
sugar nucleotide donor or acceptor binding site, loop structure, or SGC domain) on a 
glycosyltransferase defined in accordance with the invention, by deleting or adding a
chemical group or groups;

(b) determining a conformation of the complex with a favourable geometric fit and favourable 
complementary interactions; and 

(c) identifying a compound that best fits the selected site as a potential modulator of a 
glycosyltransferase. 

In still another embodiment the method comprises the following steps:

(a) selecting a computer representation of a compound complexed with a selected site (e.g. sugar 
nucleotide donor or acceptor binding site, loop structure, or SGC domain) on a 
glycosyltransferase defined in accordance with the invention; and 

(b) searching for molecules in a data base that are similar to the compound using a searching
computer program, or replacing portions of the compound with similar chemical structures
from a data base using a compound building computer program.
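
The database search in step (b) can be sketched as follows. This minimal Python illustration assumes that each molecule is described by a set of structural feature labels and that similarity is measured with the Tanimoto coefficient. The specification does not name a similarity measure, so that choice, the threshold, and the function names are hypothetical.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient of two feature sets: shared features / all features."""
    union = features_a | features_b
    if not union:
        return 0.0
    return len(features_a & features_b) / len(union)


def find_similar(query, database, threshold=0.5):
    """Return (name, similarity) pairs at or above threshold, most similar first."""
    hits = [(name, tanimoto(query, features)) for name, features in database.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical feature sets; real searches use computed fingerprints.
    query = {"uracil", "ribose", "diphosphate", "GlcNAc"}
    database = {
        "molecule_1": {"uracil", "ribose", "diphosphate"},
        "molecule_2": {"adenine", "ribose"},
    }
    print(find_similar(query, database))
```

The threshold controls how close a database molecule must be to the complexed compound before it is returned for further evaluation.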
A compound that interacts with a glycosyltransferase, binding sites or atomic contacts thereof, or a 
domain thereof, identified using a method of the invention may be used as a modulator of any 
glycosyltransferase or composition bearing the interacting binding site, atomic contacts, or domain. Therefore,
the invention features a modulator of a glycosyltransferase identified by a method of the invention.

The invention further contemplates classes of modulators of glycosyltransferases based on the three- 
dimensional structure of a sugar nucleotide donor, or component thereof, or acceptor, defined in relation to the 
sugar nucleotide donor's or acceptor's spatial association with a glycosyltransferase structure. Generally, a 
method is provided for designing potential inhibitors of a glycosyltransferase comprising the step of using the
structural coordinates of a sugar nucleotide donor or acceptor or component thereof, defined in relation to its
spatial association with the glycosyltransferase structure or a binding site thereof, to generate a compound that
is capable of associating with the glycosyltransferase or binding site thereof. 

It will be appreciated that a modulator of a glycosyltransferase may be identified by generating an
actual secondary or three-dimensional model of a binding site, synthesizing a compound, and examining the
components to find whether the required interaction occurs.

A potential modulator of a glycosyltransferase identified by a method of the present invention may be 
confirmed as a modulator by synthesizing the compound, and testing its effect on the glycosyltransferase in an
assay for that glycosyltransferase's enzymatic activity. Such assays are known in the art.

A modulator of the invention may be converted using customary methods into pharmaceutical
compositions. A modulator may be formulated into a pharmaceutical composition containing a modulator
either alone or together with other active substances. 

Therefore, the methods of the invention for identifying modulators may comprise one or more of the 
following additional steps:

(a) testing whether the modulator is a modulator of the activity of a glycosyltransferase, preferably
testing the activity of the modulator in cellular assays and animal model assays; 

(b) modifying the modulator; 

(c) optionally rerunning steps (a) or (b); and 

(d) preparing a pharmaceutical composition comprising the modulator. 

Steps (a), (b), (c) and (d) may be carried out in any order, at different points in time, and they need not be
sequential. 

The invention also contemplates a method of treating a disease associated with a glycosyltransferase 
with inappropriate activity in a cellular organism, comprising: 

(a) administering a modulator of the invention in an acceptable pharmaceutical preparation; and 

(b) activating or inhibiting a glycosyltransferase to treat the disease. 

The invention provides for the use of a modulator identified by the methods of the invention in the 
preparation of a medicament to treat a disease associated with a glycosyltransferase with inappropriate activity 
in a cellular organism. Use of the structural coordinates of a glycosyltransferase structure of the invention to 
manufacture a medicament is also provided. 

Another aspect of the invention provides machine readable media encoded with data representing the 
coordinates of the secondary or three dimensional structure of a glycosyltransferase, binding sites or atomic 
contacts thereof, or domain as defined herein, or the three dimensional structure of a sugar nucleotide donor or 
acceptor defined in relation to its spatial association with a glycosyltransferase structure as defined herein. The 
invention also provides computerized representations of the secondary or three-dimensional structures of the 
invention, including any electronic, magnetic, or electromagnetic storage forms of the data needed to define the 
structures such that the data will be computer readable for purposes of display and/or manipulation. 

These and other aspects of the present invention will become evident upon reference to the following 
detailed description and attached drawings. 
DESCRIPTION OF THE DRAWINGS 

The invention will now be described in relation to the drawings in which: 

Figure 1 is a secondary structure diagram of GnT-I, as viewed along the beta sheet, from strand "b3."
Note the eight-stranded beta sheet twist in the foreground, and the four-stranded beta sheet, offset in the
background.

Figure 2 is a secondary structure diagram of GnT-I, showing a view from the side. The first domain
is a mixed eight-stranded beta sheet, backed by alpha helices, indicated by "b" for the beta strands and "a" for
the alpha helices. The second domain is a mixed four-stranded beta sheet, again backed by helices, and
indicated with capital "B" and "A," respectively.

Figure 3 is a sample of experimental MAD MeHg-derivative GnT-I density, from the bottom of the
active site pocket. The Hg position was identified using SOLVE. SHARP was used to refine the Hg
parameters, and CCP4 dm was used for solvent-flattening and histogram matching, giving the shown map.

Figure 4 is a hydrophobic surface diagram of the top and bottom of GnT-I, with hydrophobic regions
in green. Note the patch in the pocket, as well as at the base of the alpha-helix "tower."

Figure 5 is an electrostatic surface diagram of the top and bottom of GnT-I, with acidic regions in
red, and basic regions in blue. Note the large acidic patch to one side of the active site pocket.

Figure 6 is a conservation diagram of the active site pocket of GnT-I. Conserved regions are
indicated in red, with "A" being fully conserved, and "0" unconserved. Alignments of active GnT-I's (rabbit,
human, rat, mouse, golden hamster, Chinese hamster, C. elegans #1 and #3, and frog) were performed using
CLUSTALX, and conservation was calculated using AMAS. Note the highly conserved active site pocket.

Figure 7 is a worm diagram of the GnT-I structure with secondary structure shown. Beta strands are
shown as arrows, and alpha helices are helices. UDP-GlcNAc and the Mn2+ ion are shown in the binding site.

Figures 8A through 8F are surface diagrams of the GnT-I structure. UDP-GlcNAc and the Mn2+ ion are
shown in the binding site. (8A) The phosphate-binding loop lid, which forms upon UDP-GlcNAc binding, is
shown as a worm. (8B) The loop is shown as a surface. (8C) The surface has been colored according to
potential. Basic potential is shown in blue, and acidic potential is shown in red; the loop is shown as a worm.
(8D) As in 8C, but with the loop shown as a surface. (8E) The surface has been colored according to residue
AMAS conservation index. Red regions are conserved, white are unconserved; the loop is shown as a worm.
(8F) As in 8E, but with the loop shown as a surface.

Figures 9A and 9B are diagrams showing the active site of the GnT-I enzyme. Asp291 is shown as a stick figure
on the left side of the pocket, while the rest of the protein is shown as a surface. UDP-GlcNAc is shown as a
stick figure on the right. Mn2+ has been shown as a sphere. (9A) The loop is shown as a worm. (9B) The loop
is shown as a surface. Note the mannose-sized active site pocket.

Figure 10 is a surface diagram of GnT-I bound to the model of the Man5GlcNAc2 acceptor. UDP-GlcNAc
is shown as a stick, and the Mn2+ has been shown as a sphere. (10A) The acceptor model is shown as a
stick figure. (10B) The acceptor has been shown as a space-filling van der Waals figure.

Figure 11 is the same as Figure 10 but from a different angle, making the fit of the acceptor to the
surface more visible. (11A) The acceptor has been shown as a stick figure, and the loop as a worm. (11B) The
acceptor has been shown as a stick figure, and the loop as a surface. (11C) The acceptor has been shown as
space-filling van der Waals spheres, and the loop as a surface. (11D) As in 11B, but with the surface colored
according to residue conservation index. Note the correlation of the acceptor model to red conserved residues.

Figure 12 shows a model of the active site of GnT-I, with the base D291 (i.e. Asp291), the α-1,3
mannose O2, and the GlcNAc C1 joined by lines of small spheres. The protein backbone has been shown as an
alpha-carbon trace, the acceptor Man5GlcNAc2 sugar, UDP-GlcNAc, and protein side-chains have been shown
as stick figures, and the Mn2+ ion and bound water molecules have been shown as spheres.

Figure 13 shows a model of the overlay of GnT-I (red), Bacillus subtilis nucleotide-diphospho-sugar
transferase (spsA) (green), Escherichia coli N-acetylglucosamine 1-phosphate uridyltransferase (GlmU) (blue),
and bovine β-1,4-galactosyltransferase T1 (galT) (cyan). Parts of the protein sequence not in the transferase
fold are shaded a darker color.

Figure 14 shows the overlay of GnT-I and GlmU from the model of Figure 13. The DALI z-score (a
measure of structural similarity) for this overlay is 9.6. Dissimilar structures give scores less than 2; greater
similarity gives a higher score.

Figure 15 shows the overlay of GnT-I and β-1,4-galT from the model of Figure 13. The DALI z-score
for this overlay is 10.6.

Figure 16 shows the overlay of GnT-I and spsA from the model of Figure 13. The DALI z-score for
this overlay is 15.7.

Figure 17 shows the model of Figure 13 from a different angle. Note the overlay of the helix-loop-helix
containing the catalytic base Asp residue (Asp 291 in GnT-I, Asp 191 in spsA, and Asp in galT).

Figure 18 shows the overlay of GnT-I and GlmU from the model of Figure 17.

Figure 19 shows the overlay of GnT-I and galT from the model of Figure 17.

Figure 20 shows the overlay of GnT-I and spsA from the model of Figure 17.

Figure 21 shows the secondary structure of GnT-I. Helices are in red and β sheets are in green.
Areas not in the conserved fold are darkened.

Figure 22 shows the secondary structure of GlmU. Helices are in red and β sheets are in green. β
strand 6 has been deleted.

Figure 23 shows the secondary structure of galT. Helices are in red and β sheets are in green. β
strand 3 has been deleted along with helix 2 leading into it. Instead, a small β finger N-terminal of the core
domain and a β finger C-terminal of the core domain occupy the space of β strand 3.

Figure 24 shows the secondary structure of spsA. Helices are in red and β sheets are in green. All
eight strands are present in the core domain.

Figure 25 is a GnT I ribbon diagram. Domain 1 is shown in cyan, the loop (residues 318 to 330)
structured upon UDP-GlcNAc binding in red, the linker connecting Domain 1 and Domain 2 in green, Domain
2 in brown, and the UDP-GlcNAc and the Mn2+ ion in yellow. All molecular images were prepared
using SPOCK (Christopher, 1998) and rendered using Raster3D (Bacon, 1988; Merritt, 1994).

Figure 26 shows the electrostatic potential surface of GnT I, showing the acidic pocket into which the
Mn2+ ion and UDP-GlcNAc bind. Acidic residues are colored red, and basic residues blue, with a gradient
through ± 10 kT. The UDP-GlcNAc is shown in yellow.

Figure 27 shows a sample of the AMAS analysis. Shown is an excerpt from the AMAS analysis, with
residues in the region of the "DxD" motif (residues 211 to 213, EDD). GnT I sequences from rabbit, human,
mouse, rat, Chinese hamster, golden hamster, frog, and C. elegans genes gly-12 and gly-14, were aligned using
ClustalX, and conservation was scored using AMAS. Unconserved residues are given a score of "0", and fully
conserved residues are given a score of "A". (SEQ ID NO 1, 2, 3, 4, and 5).

Figure 28 shows AMAS surface analysis. AMAS residue scores, as shown in Figure 27, were then 
mapped onto the protein surface, with a gradient from green for a completely unconserved score of 0, to white 
for an AMAS score of 5, to red for a fully conserved score of "A". 

Figure 29 shows a stereo ribbon overlay of the SGC domains of GnT I (red) and spsA (green). For
clarity only the α-helices are labeled. UDP (spsA) and UDP-GlcNAc (GnT I) are shown in stick representation.
M and C label the side chains of the metal binding and catalytic aspartic acid residues, also shown in stick
representation.

Figure 30 shows topology diagrams of GnT I, spsA, GlmU (an N-acetylglucosamine-1-phosphate
uridyltransferase from Escherichia coli) and β4Gal-T1 (a bovine galactosyltransferase, β-1,4-
galactosyltransferase 1). Beta strands are shown as green triangles, and alpha helices as red circles, with
missing elements shown in white. The secondary structural elements are labeled as in GnT I. The boxed gray
region corresponds to the SGC domain.

Figure 31 shows a structural alignment of GnT I, spsA, GlmU and β4Gal-T1 (SEQ ID NO 6, 7, 8,
and 9). Shown are two excerpts from the complete alignment, numbered according to rabbit GnT I. In the top
alignment, the region around the DxD motif is shown, with the motif highlighted in magenta. In the bottom
alignment, the area around the catalytic Asp is shown, with the catalytic residue again highlighted in magenta.
Note that GlmU is not a glycosyltransferase, but rather an N-acetylglucosamine-1-phosphate uridyltransferase,
so it does not share the catalytic residue found in GnT I, spsA, and β4Gal-T1.

Figure 32 shows the GnT I substrate binding site. All interactions between the protein, the UDP-GlcNAc,
the Mn2+ ion, and structured waters are shown as lines composed of small white spheres.

Figures 33A, 33B, and 33C show a stereo view of the UDP-GlcNAc/Mn2+ binding site. Carbon, oxygen,
nitrogen, sulfur, and phosphorus are colored white, red, blue, yellow and purple respectively; water molecules
are cyan and the Mn2+ ion is salmon. Hydrogen bonds are shown as dotted lines. The C1 of the
acetylglucosamine moiety is labeled for reference. 33A) Uracil and ribose interactions; 33B) Mn2+ and
phosphate interactions; 33C) N-acetylglucosamine interactions.

Figure 34 shows interactions between GnT I, the Mn2+ ion, and the UDP-GlcNAc phosphates. R117
is from the N-terminus of helix α1, E211 and D213 are from the C-terminus of strand β4, T315 and G317 are
from strand β8' and the N-terminus of the loop lid, and V321 and S322 are from the tip of the loop lid.

Figure 35 shows interactions between GnT I and the GlcNAc group of UDP-GlcNAc. Residue Y184
is in helix α3, residue E211 is in strand β4, residue L269 is from the C-terminus of strand β7, residues F289,
W290, D291 and R295 are from helix α6, and L331 is from the C-terminal end of the loop lid. D291 is the
only Asp that is close enough to the GlcNAc C1 to act as the catalytic base.

Figure 36 shows GnT I overlaid on spsA: GnT I appears in red, and spsA in green. In this Figure, the
position of the ligands is shown. GnT I is bound to UDP-GlcNAc, shown as a red stick figure, along with a
Mn2+ ion, shown as a red sphere near Asp213; spsA is bound to UDP, shown as a green stick figure, along with
a Mn2+ ion, shown as a green sphere near Asp 99. Note how the nucleotides and proteins overlay very closely.
The catalytic base residue in GnT I, Asp 291, identified by this structure, has an analogous residue in spsA,
Asp 191. This predicts that Asp 191 is the catalytic base in spsA. The catalytic base was not identifiable with
the spsA x-ray crystal structure alone, due to the absence in the spsA structure of the sugar residue normally
attached to the UDP.

Figure 37 shows GnT I overlaid on β-1,4-galT: GnT I appears in red, and β-1,4-galT in cyan. Again,
the ligands of GnT I are shown, as in Figure 36. The ligand in the β-1,4-galT x-ray crystal structure, UDP, is
shown as a stick figure; the Mn2+ normally required in the reaction is absent, as is the sugar part of the donor
sugar-nucleotide. Again, GnT I's Asp 291 has an analogous galT residue, Asp318. This residue is predicted by
the GnT I structure to be the catalytic base in β-1,4-galT.

Figure 38 shows GnT I overlaid on GlmU: GnT I appears in red, and GlmU in navy blue. The ligands
of GnT I are shown in red, as in Figures 36 and 37. The GlmU product, UDP-GlcNAc, is shown as a navy
blue stick figure. As GlmU is not a glycosyltransferase, these two enzymes do not catalyse the same reaction,
and thus the residues involved in enzymatic action are expected to be different. However, the similar fold and
similar location of sugar-nucleotide binding suggest that these enzymes may have evolved from an extremely
distant common ancestor.

Figure 39 shows the DxD Motif. Atom colors and labels are as in Figure 33. Letters i to i+3
correspond to the residues of the type I β-turn. The hydrogen bond characteristic of this turn type is shown in
green.

Figures 40A and 40B show a stereo diagram of the structured loop and the acceptor binding pocket.
Atom colors and labels are as in Figure 33. Backbone tubes and molecular surfaces are color coded as follows:
red, structured loop; green, linker region; cyan, Domain 1; brown, Domain 2. 40A) Structured loop and
UDP-GlcNAc/Mn2+ interactions; 40B) Surface representation of the acceptor binding pocket. The side chain of the
catalytic base (D291) and the N-acetylglucosamine moiety of the UDP-GlcNAc are seen at the base of the
pocket.

DETAILED DESCRIPTION OF THE INVENTION
Summary of Tables 1 to 8

Table 1 - structural coordinates of an N-acetylglucosaminyltransferase I (GnT-I) native structure.

Table 2 - structural coordinates of a GnT-I with bound MeHg.

Table 3 - structural coordinates of a rabbit GnT-I bound to UDP-GlcNAc and a manganese 2+ ion.

Table 4 - structural coordinates of a GnT-I with acceptor.

Table 5 - intermolecular contacts of the GnT-I-UDP-GlcNAc complex.

Table 6 - crystallographic data and refinement statistics.

Table 7 - the UDP-GlcNAc binding site.

Table 8 - protein threading results.

In Tables 1 to 4, from the left, the second column identifies the atom number; the third identifies the
atom type; the fourth identifies the amino acid type; the fifth identifies the residue number; the sixth identifies
the x coordinates; the seventh identifies the y coordinates; the eighth identifies the z coordinates; the ninth
identifies the occupancy; and the eleventh identifies the temperature factor.

Definitions: 

Unless otherwise indicated, all terms used herein have the same meaning as they would to one skilled
in the art of the present invention. Practitioners are particularly directed to Current Protocols in Molecular
Biology (Ausubel) for definitions and terms of the art.

"Glycosyltransferase structure" or "glycosyltransferase secondary or three-dimensional structure"
refers to the three-dimensional structure (i.e. tertiary structure) or arrangement of secondary structural elements
of a purified polypeptide comprising a glycosyltransferase. A glycosyltransferase structure may be in
association with or complexed with a moiety including a heavy metal atom or metal cofactor. A
glycosyltransferase structure may be in crystalline form.

The term "crystalline form" in the context of the invention, is a crystal formed from an aqueous
solution comprising a purified polypeptide comprising a glycosyltransferase. The glycosyltransferase is
preferably a glycosyltransferase with an SGC domain, including but not limited to a glycosyltransferase
structurally related to N-acetylglucosaminyltransferase I to VIII, preferably N-acetylglucosaminyltransferase I.
A crystalline form of a glycosyltransferase is characterized as being capable of diffracting x-rays in a pattern
defined by one of the crystal forms depicted in Blundell et al., 1976, Protein Crystallography, Academic Press. A
crystalline form may include a crystal structure in association with one or more moieties, including heavy-
metal atoms, i.e. a derivative crystal, or one or more compounds, i.e. a co-crystal.

The term "associate", "association" or "associating" refers to a condition of proximity between a
moiety (i.e. chemical entity or compound or portions or fragments thereof), and a glycosyltransferase, or parts
or fragments thereof (e.g. binding sites or domains). The association may be non-covalent, i.e. where the
juxtaposition is energetically favored by, for example, hydrogen-bonding, van der Waals, or electrostatic or
hydrophobic interactions, or it may be covalent.

The term "heavy-metal atom" refers to an atom that can be used to solve an x-ray crystallography
phase problem, including but not limited to a transition element, a lanthanide metal, or an actinide metal.
Lanthanide metals include elements with atomic numbers between 57 and 71, inclusive. Actinide metals
include elements with atomic numbers between 89 and 103, inclusive.

A "metal cofactor" refers to a metal ion required for a glycosyltransferase to transfer the selected
sugar from the sugar nucleotide donor to the acceptor. For example, the metal cofactor for
N-acetylglycosyltransferase may be a divalent cation like manganese, or magnesium, and other similar atoms or
metals.

The term "glycosyltransferase" refers to an enzyme that catalyzes the transfer of a single
monosaccharide unit from a donor to the hydroxyl group of an acceptor substrate. The acceptor can be either a
free saccharide, glycoprotein, glycolipid, or polysaccharide. The donor can be a nucleotide-sugar, or
dolichol-phosphate-sugar. Glycosyltransferases show a precise specificity for both the sugar acceptor and donor and
generally require the presence of a metal cofactor. The term "glycosyltransferase" also encompasses
polypeptides comprising a SGC domain.

Glycosyltransferases include but are not limited to eukaryotic glycosyltransferases involved in the
biosynthesis of glycoproteins, glycolipids, glycosylphosphatidylinositols and other complex glycoconjugates,
and prokaryotic glycosyltransferases involved in the synthesis of carbohydrate structures of bacteria and
viruses, including enzymes involved in LOS and lipopolysaccharide biosynthesis. Examples of
glycosyltransferases include N-acetylglucosaminyltransferases, including N-acetylglucosaminyltransferases I
through VIII involved in the biosynthesis of complex and hybrid N-glycans; UDP-N-acetylglucosamine:N-
acetylgalactosamine β1,6-N-acetylglucosaminyltransferases (core 2 GlcNAc transferases); Core 3 GlcNAc
transferase, Core 4 GlcNAc transferase; Core 1 and Core 2 elongation glycosyltransferases involved in the
biosynthesis of O-glycans and the glycosyltransferases involved in the biosynthesis of antigen determinants
(blood group i and blood group I); and structurally related proteins.

The enzyme at the gateway from high-mannose structures to hybrid and complex N-glycans is
UDP-N-acetylglucosamine:α-3-D-mannoside β-1,2-N-acetylglucosaminyltransferase I [GnT I; E.C. 2.4.1.101;
Harpaz and Schachter, 1980; Narasimhan et al, 1977; Stanley et al, 1975; See GenBank M61829 and M55621
(human) and M57301 (rabbit) for nucleic acid and amino acid sequences]. It transfers the first
N-acetylglucosamine residue onto the high-mannose core and all other enzymes in the hybrid and complex
pathway depend on its prior action (Schachter, 1986; Schachter, 1991). GnT I plays a fundamental role in
mammalian development, as shown by knockout studies in mice (Ioffe, 1994; Metzler, 1994). Moreover,
mutation or misregulation of several of the enzymes dependent on GnT I action is associated with human
disease and metastasis (Jaeken et al, 1994; Charuk et al, 1995; Jaeken et al, 1993; Granovsky et al, 2000; Tan
et al, 1996).

Glycosyltransferases have been classified into 44 different families, based on both sequence similarity
and substrate/product stereochemistry (inverting or retaining) (Campbell et al, 1997; Campbell et al, 1998;
Coutinho and Henrissat, 1999). GnT I (family 13) is an inverting glycosyltransferase: the α-linked GlcNAc
moiety from the UDP-α-GlcNAc donor is transferred to the 3-arm of the Man5GlcNAc2 acceptor, creating the
β-linked GlcNAc-β-1,2-Man-R product (Reck et al, 1994).

As applied to polypeptides, the term "substantial sequence identity" means that two peptide 
sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap, share at least 
40%, 50%, 60%, 65%, 70%, 75%, 80%, or 85% sequence identity, preferably at least 90 percent sequence 
identity, more preferably at least 95 percent sequence identity or more. Preferably, residue positions which are 
not identical differ by conservative amino acid substitutions. For example, the substitution of amino acids
having similar chemical properties such as charge or polarity is not likely to affect the properties of a protein.
Examples include glutamine for asparagine or glutamic acid for aspartic acid. 
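
The percent sequence identity thresholds above can be illustrated with a minimal Python sketch that scores two sequences which have already been aligned. This is not the GAP or BESTFIT algorithm; the function name percent_identity, the "-" gap character, and the choice to count only columns in which both sequences carry a residue are assumptions made for this illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumes the two strings come from an existing pairwise alignment
    (hypothetical input; the alignment step itself is not performed here).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def has_substantial_identity(aligned_a, aligned_b, threshold=40.0):
    """True if the identity meets a chosen threshold (40% is the lowest value recited)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the denominator should be stated whenever an identity figure is reported.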

The term "mutant" refers to a polypeptide that is obtained by replacing at least one amino acid residue 
in a native glycosyltransferase with a different amino acid residue. Mutation can also be accomplished by 
adding and/or deleting amino acid residues within the native glycosyltransferase or part thereof. A mutant may
or may not be functional. 

The term "function" refers to the ability of a modulator to enhance or inhibit the association between
a glycosyltransferase and a compound, or the activity of the glycosyltransferase.

"Modulator" refers to a molecule which changes or alters the biological activity of a
glycosyltransferase. A modulator may increase or decrease glycosyltransferase activity, or change its 
characteristics, or functional or immunological properties. It may be an inhibitor that decreases the biological 
or immunological activity of the protein. Modulators include but are not limited to peptides, members of 
random peptide libraries and combinatorial chemistry-derived molecular libraries, phosphopeptides (including 
members of random or partially degenerate, directed phosphopeptide libraries), antibodies, carbohydrates, 
nucleosides or nucleotides or parts thereof, and small organic or inorganic molecules. A modulator may be an 
endogenous physiological compound, or it may be a natural or synthetic compound. 

The term "atomic structural coordinates" or "structural coordinates" as used herein refers to a data set 
that defines the three dimensional structure of a molecule or molecules (e.g. Cartesian coordinates, temperature 
factors, and occupancies). Structural coordinates can be slightly modified and still render nearly identical three 
dimensional structures. A measure of a unique set of structural coordinates is the root-mean-square deviation
of the resulting structure. Structural coordinates that render three dimensional structures (in particular a three
dimensional structure of an SGC domain) that deviate from one another by a root-mean-square deviation of
less than 5 Å, 4 Å, 3 Å, 2 Å, or 1.5 Å may be viewed by a person of ordinary skill in the art as very similar.
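
As an illustration of the root-mean-square deviation criterion, the following minimal Python sketch computes the deviation between two sets of coordinates. It assumes the two structures have already been superimposed and that equivalent atoms are listed in the same order; the function names rmsd and is_very_similar and the default cutoff are hypothetical choices for this example, and no superposition step is performed.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates, in the input units."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared distance between the two equivalent atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def is_very_similar(coords_a, coords_b, cutoff=1.5):
    """True if the deviation is below the cutoff (1.5 is the tightest value recited)."""
    return rmsd(coords_a, coords_b) < cutoff
```

In practice an optimal superposition (for example, a least-squares fit of the alpha-carbon atoms) would be carried out first, since the deviation depends on how the two coordinate sets are oriented relative to one another.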

The term "unit cell" refers to the smallest and simplest volume element (i.e. parallelepiped-shaped
block) of a crystal that is completely representative of the unit of pattern of the crystal. The unit cell axial
lengths are represented by a, b, and c. Those of skill in the art understand that a set of atomic coordinates
determined by X-ray crystallography is not without standard error.

The term "space group" refers to the lattice and symmetry of the crystal. In a space group designation
the capital letter indicates the lattice type and the other symbols represent symmetry operations that can be
carried out on the contents of the asymmetric unit without changing its appearance.

The term "purified" in reference to a polypeptide, does not require absolute purity such as a
homogenous preparation; rather, it represents an indication that the polypeptide is relatively purer than in the
natural environment. Generally, a purified polypeptide is substantially free of other proteins, lipids,
carbohydrates, or other materials with which it is naturally associated, preferably at a functionally significant
level, for example at least 85% pure, more preferably at least 95% pure, most preferably at least 99% pure. A
skilled artisan can purify a polypeptide comprising a glycosyltransferase using standard techniques for protein
purification. A substantially pure polypeptide comprising a glycosyltransferase will yield a single major band
on a non-reducing polyacrylamide gel. The purity of the glycosyltransferase can also be determined by
amino-terminal amino acid sequence analysis.

A "sugar nucleotide donor" refers to a nucleotide coupled to a selected sugar that is transferred by a
glycosyltransferase to an acceptor. The selected sugar may be a monosaccharide. A suitable selected sugar
includes N-acetyl glucosamine (GlcNAc). The N-acetyl glucosamine may be modified; for example, the
hydroxyls may be blocked with acetonide, acylated, or alkylated or substituted with other groups such as
halogen. For N-acetylglucosaminyltransferases the nucleotide is preferably UDP. For other enzymes, the
nucleotide may be GDP (fucosyltransferases and mannosyltransferases), or CMP (sialyltransferases). The
heterocyclic amine base in the nucleotide may be modified. For example, when the base is uridine it may be
modified at the C-5 or C-6 position with groups including but not limited to alkyl, aryl, and electron donating
and electron withdrawing groups. The sugar in the nucleotide (e.g. ribose) may be modified at the 2' or 3'
position with groups including but not limited to alkyl, aryl, and electron donating and electron withdrawing
groups.

"Acceptor" refers to the part of a carbohydrate structure (e.g. glycoprotein, glycolipid) where the
selected sugar is transferred by the glycosyltransferase. The acceptor may comprise Man5GlcNAc2.

Abbreviations for amino acid residues are the standard 3-letter and/or 1-letter codes used in the art to
refer to one of the 20 common L-amino acids.

Glycosyltransferase Structures

The present invention provides a secondary or three-dimensional structure of a glycosyltransferase or 
part thereof (e.g. binding site or domain). In an embodiment the structure is a crystalline form. A 
glycosyltransferase structure may comprise a glycosyltransferase in a unit cell. In an embodiment, a
glycosyltransferase is arranged in a crystalline manner in a space group P2₁2₁2₁ so as to form a unit cell of
dimensions a= 40.4 ± 3.0 Å, b= 82.4 ± 3.0 Å, c= 102.5 ± 3.0 Å, α=β=γ= 90°, and which effectively diffracts
X-rays for determination of the atomic coordinates of a glycosyltransferase. The secondary and three-
dimensional structure of a preferred glycosyltransferase of the invention is illustrated by the N-acetyl
glucosaminyl transferase I (GnT I) structure specifically described herein. A glycosyltransferase structure may
be defined by the structural coordinates of Tables 1, 2, 3, or 4.
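
As a worked illustration of the unit cell parameters recited above, the following minimal Python sketch computes a cell volume from axial lengths and angles and checks whether a set of observed axial lengths falls within the stated ± 3.0 angstrom tolerance. The function names unit_cell_volume and within_tolerance are hypothetical, and the general triclinic volume formula is used only so that the right-angled case appears as a special case.

```python
import math


def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in cubic angstroms; lengths in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # General (triclinic) formula; reduces to a * b * c when all angles are 90 degrees.
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def within_tolerance(observed, nominal=(40.4, 82.4, 102.5), tol=3.0):
    """True if each observed axial length is within tol of the nominal a, b, c values."""
    return all(abs(o - n) <= tol for o, n in zip(observed, nominal))
```

With the nominal lengths a = 40.4, b = 82.4, and c = 102.5 and all angles at 90 degrees, the volume is simply the product of the three lengths, about 3.4 x 10^5 cubic angstroms.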

A glycosyltransferase structure includes the secondary or three-dimensional structure of a native
glycosyltransferase, a derivative glycosyltransferase, or a mutant glycosyltransferase. Thus, a crystalline form
includes native crystals, derivative crystals, and co-crystals. The crystals generally comprise a substantially
pure glycosyltransferase in crystalline form. It is understood that the glycosyltransferase structures of the
invention are not limited to naturally occurring or native glycosyltransferases but include polypeptides
comprising an SGC domain, or polypeptides with substantial sequence identity to a glycosyltransferase. A
glycosyltransferase structure also includes mutants of a native glycosyltransferase obtained by replacing at
least one amino acid residue in a native glycosyltransferase with a different amino acid residue, or by adding or
deleting amino acid residues within the native polypeptide, and having substantially the same secondary or
three-dimensional structure as the native glycosyltransferase from which the mutant is derived, i.e. having a set
of atomic structural coordinates that have a root mean square deviation of less than or equal to about 5, 4, 3, 2,
or 1.5 Å when superimposed with the atomic structure coordinates of the native glycosyltransferase from which
the mutant is derived when at least 50% to 100% of the atoms of the native glycosyltransferase domain are
included in the superimposition. It should be noted that the glycosyltransferase structures contemplated herein
need not exhibit glycosyltransferase activity.

A derivative glycosyltransferase structure of the invention comprises a glycosyltransferase structure in
association with one or more moieties that are heavy metal atoms. For example, derivative crystals of the
invention generally comprise a crystalline glycosyltransferase in covalent association with one or more heavy
metal atoms. The glycosyltransferase may correspond to a native or mutated glycosyltransferase. Heavy metal
atoms useful for providing derivative glycosyltransferase structures include by way of example, and not
limitation, gold, mercury, etc.

The invention features a glycosyltransferase structure in association with one or more moieties that
are compounds (e.g. UDP-GlcNAc, uridine-ribose, phosphate-Mn2+, Man5GlcNAc2, one or more metal
cofactors). The association may be covalent or non-covalent. Crystalline forms of this type are referred to
herein as co-crystals. The compound may be any organic molecule, and it may modulate the function of a
glycosyltransferase by, for example, inhibiting or enhancing its function, or it may be an acceptor, donor, or
metal cofactor for the glycosyltransferase. It is preferred that the geometry of the compound and the
interactions formed between the compound and the glycosyltransferase provide high affinity binding between
the two molecules.

The secondary or three-dimensional structures of the particular glycosyltransferases described herein 
provide useful models for the secondary or three-dimensional structures of glycosyltransferases from any
species, particularly mammalian, including bovine, ovine, porcine, murine, equine, preferably human, from any
source whether natural, synthetic, semi-synthetic, or recombinant.

Binding Sites - The GnT-I active site

N-acetylglucosaminyltransferase I catalyses the addition of a β-1,2 GlcNAc onto the α-1,3 arm of the
Man5GlcNAc2 N-linked carbohydrate moiety. The structure has allowed the identification of the binding site of
UDP-GlcNAc, identification of the reaction centre, and the development of a working model of the
Man5GlcNAc2-acceptor binding site that correlates with biochemical reaction inhibition evidence.

The UDP-GlcNAc binding site can be subdivided into three sub sites: the uridine-ribose binding sub
site, the phosphate-Mn2+ binding sub site, and the GlcNAc sub site.

In the uridine-ribose sub site, there are three direct hydrogen bonds between the protein and the
nucleotide sugar. Asp144 interacts with the uridine N3, the His190 ND1 interacts with the uridine O2, and
Asp212 binds the ribose O3. In addition, there is one water-mediated bond between Asp212 and the ribose
O2. Meanwhile, the uridine base makes van der Waals interactions with Ile187, as well as the cysteine bridge
between Cys115 and Cys145.

The phosphate-Mn2+ site is the subject of many interactions between the nucleotide sugar and protein;
in fact, while the manganese co-ordination site lies on the enzyme's surface, a majority of the interactions with
the phosphates come from a loop which structures itself on top of the substrate upon binding.

The protein itself has only one direct co-ordination bond to the Mn2+, via Asp213; since two of the six
co-ordination points are taken up by the phosphate oxygens (one from each phosphate), the final three points
are bound by water. These waters are then bound by the Thr315 OG, the Gly317 carbonyl oxygen, Glu211
and Asp213.

The phosphate groups make one direct hydrogen bond to Arg317NH on the protein's rigid surface,
while making three hydrogen bonds with the flexible loop which rigidifies into a lid on top of the
phosphate-Mn2+ subsite. These loop interactions are with the Val321 backbone N and the backbone N and OG
of Ser322. In addition, a two-water hydrogen-bonding bridge leads to Asp116.

In contrast to the previous two sub sites, which hold the UDP-GlcNAc rigidly in place, the
GlcNAc-binding sub site must allow the sugar ring enough flexibility to go through the flat penta-coordinate C1
intermediate. Three direct hydrogen bonds are made: two between the GlcNAc O4 and Asp211 and Trp290,
anchoring the O4-C1 axis of the GlcNAc in place, and one between the GlcNAc O3 and Asp211, establishing
the correct pucker for the sugar ring. One water bridge also exists between the sugar and the protein; the
GlcNAc O6 hydrogen bonds to a water molecule held in place by the amide nitrogens of Phe289 and Trp290,
along with the carbonyl oxygen of Tyr184.

The acetyl group methyl makes van der Waals contact with Leu269 and Leu331, leaving the acetyl
group O7 and N2, along with the GlcNAc ring O5, unbonded. This lack of interaction may give the C2 and
O5 enough flexibility to make the movements necessary for the C1 to achieve the reaction intermediate SN2
conformation.

This experimental nucleotide-sugar binding conformation has allowed the identification of the base in
the reaction: Asp291 is located just over 5 Å from the GlcNAc C1, putting it in a perfect position to perform
this role.

The identification of the reaction centre and binding site has provided the constraints necessary to
make a theoretical model of the Man5GlcNAc2-acceptor binding site. The α-1,3 mannose O2 is placed
between the Asp291 OD1 and the GlcNAc C1, putting it into position for the nucleophilic attack on the C1.
This positioning forces the conformation of the rest of the mannose: the O3 forms hydrogen bonds to Asp291,
Arg295, and a structured water held in place by Arg415; the O4 hydrogen bonds to the same structured water
as the O3; and the O6 hydrogen bonds to both a UDP-GlcNAc phosphate and the OG of Ser322, a
phosphate-binding lid loop residue. This α-1,3 mannose orientation corresponds with biochemical evidence
that all of the mannose ring's hydroxyl groups are important. In addition, this model supports the
ordered-sequential reaction sequence, as the GlcNAc is buried below the Man5GlcNAc2, as well as further
evidence that the Man5GlcNAc2 binding site is partially formed upon GnT-I's UDP-GlcNAc binding.

The Man5GlcNAc2 core mannose position is also constrained by the reaction centre location: the O4
hydrogen bonds to Asp291, the ring is in van der Waals contact with Phe289 and Tyr184, and the O6 hydrogen
bonds to Asp292. Again, this position supports the known biochemistry, as the O4 is important, the O2 is
unimportant, and the β O1 linkage is required to allow the ring to sit against the protein surface. An α O1
would clash with the protein, and may break up the important lectin-like van der Waals interaction with the
phenylalanine.

Finally, the model allows the positioning of the α-1,6 mannose, the α-1,3 and α-1,6 mannoses
attached to it, and the chitobiosyl-core GlcNAc2. The positions of these sugar rings in the model correspond
with the location of conserved GnT-I surface residues; biochemical evidence states that these sugars are less
important to Man5GlcNAc2 binding, and thus their position is less well defined than the α-1,3 arm and core
mannose.

In summary, the N-acetylglucosaminyltransferase I structure has allowed the exact identification of
the UDP-GlcNAc binding site, along with the reaction centre, and allowed the prediction of the
Man5GlcNAc2-acceptor binding site. This UDP-GlcNAc-bound, closed-loop GnT-I structure is critical for the
design of high-affinity inhibitors to the activity of GnT-I.

Therefore, the invention contemplates a secondary or three-dimensional structure of a binding site of
a glycosyltransferase. Binding sites include the binding site for a diphosphate group of a sugar nucleotide
donor, a nucleotide of a sugar nucleotide donor, a nitrogenous heterocyclic base (preferably a pyrimidine
base, more preferably uracil) of a sugar nucleotide donor, a sugar of the nucleotide of a sugar nucleotide donor,
a selected sugar of a sugar nucleotide donor that is transferred to an acceptor, and/or an acceptor. A three
dimensional structure of a binding site may be defined by selected atomic contacts, preferably the enzyme
atomic contacts as defined in Table 5.

In an embodiment of the invention, a secondary or three-dimensional structure of a binding site of a 
glycosyltransferase that associates with a diphosphate of a sugar nucleotide donor (or the secondary or three- 
dimensional structure of a complex of the binding site with the diphosphate) is provided comprising at least
two or three atomic contacts of atomic interactions 8, 9, and 10 in Table 5, each atomic interaction defined
therein by an atomic contact (more preferably, a specific atom where indicated) on the diphosphate group, and
an atomic contact (more preferably, a specific amino acid residue where indicated) on the glycosyltransferase
(i.e. enzyme atomic contact). The binding site may be defined by the enzyme atomic contacts of atomic
interactions 8 and 9; 8 and 10; 9 and 10; or 8, 9, and 10 in Table 5. Preferably, the binding site is defined by
the atoms of the enzyme atomic contacts having the structural coordinates for the atoms listed in Table 1, 2, 3,
or 4.
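
The way a binding site is defined by "at least two or three" of a set of numbered atomic interactions can be sketched as a simple enumeration of subsets. The helper name `contact_subsets` is hypothetical, and only the interaction numbers (8, 9, and 10) come from the text; the contents of Table 5 are not reproduced here.

```python
from itertools import combinations


def contact_subsets(interactions, minimum):
    """Return every subset of the given interaction numbers that contains
    at least `minimum` members, smallest subsets first."""
    subsets = []
    for size in range(minimum, len(interactions) + 1):
        subsets.extend(combinations(interactions, size))
    return subsets


# Diphosphate binding site: at least two of atomic interactions 8, 9, and 10.
diphosphate_definitions = contact_subsets([8, 9, 10], minimum=2)
for subset in diphosphate_definitions:
    print(subset)
```

For the diphosphate site this yields the four definitions recited above: 8 and 9; 8 and 10; 9 and 10; and 8, 9, and 10.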

In an embodiment of the invention, a secondary or three-dimensional structure of a binding site of a 
glycosyltransferase that associates with a heterocyclic amine base (preferably uracil) of a sugar nucleotide
donor (or the secondary or three-dimensional structure of a complex of the binding site with a heterocyclic
amine base) is provided comprising at least two, three, or four atomic contacts of atomic interactions 1, 2, 3, 4,
and 5 in Table 5, each atomic interaction defined therein by an atomic contact (more preferably, a specific
atom where indicated) on the heterocyclic amine base, and an atomic contact (more preferably, a specific
amino acid residue where indicated) on the glycosyltransferase (i.e. enzyme atomic contact). The binding site
may be defined by the enzyme atomic contacts of atomic interactions 1, 2, and 3; 2, 3, and 4; 3, 4, and 5; 1, 2,
and 4; 1, 2, and 5; 1, 3, and 4; 1, 3, and 5; 2, 3, and 5; 2, 4, and 5; 1, 2, 3, and 4; 1, 2, 3, and 5; 2, 3, 4, and 5; 1,
3, 4, and 5; or 1, 2, 3, 4, and 5 in Table 5. Preferably, the binding site is defined by the atoms of the enzyme 
atomic contacts having the structural coordinates for the atoms listed in Table 1, 2, 3, or 4. 

In an embodiment of the invention, a secondary or three-dimensional structure of a binding cavity of a
glycosyltransferase that associates with the sugar of the nucleotide (preferably ribose) of a sugar nucleotide
donor (or a secondary or three-dimensional structure of a complex of the binding site with the sugar) is
provided comprising the atomic contacts of atomic interactions 6 and 7 in Table 5, each atomic interaction
defined therein by an atomic contact (more preferably, a specific atom where indicated) on the sugar, and an
atomic contact (more preferably, a specific amino acid residue where indicated) on the glycosyltransferase (i.e.
enzyme atomic contact). The binding site may be defined by the enzyme atomic contacts of atomic interactions
6 and 7 in Table 5. Preferably, the binding site is defined by the atoms of the enzyme atomic contacts in the
binding site having the structural coordinates for the atoms listed in Table 1, 2, 3, or 4. 

In an embodiment of the invention, a secondary or three-dimensional structure of a binding cavity of a 
glycosyltransferase that associates with a selected sugar (GlcNAc) of a sugar nucleotide donor (or a secondary
or three-dimensional structure of a complex of the binding site with the selected sugar) is provided comprising
at least two, three, four, five, six, seven, or eight atomic contacts selected from the atomic contacts of atomic
interactions 14, 15, 16, 17, 18, 19, 20, and 21 in Table 5, each atomic interaction defined therein by an atomic
contact (more preferably, a specific atom where indicated) on the selected sugar, and an atomic contact (more
preferably, a specific amino acid residue where indicated) on the glycosyltransferase (i.e. enzyme atomic
contact). The binding site may be defined by the enzyme atomic contacts of atomic interactions 14, 18, and
19; 14, 20, and 21; 14, 15, 16, and 17; 18, 19, 20, and 21; and 14 through 21 in Table 5. Preferably, the
binding site is defined by the atoms of the enzyme atomic contacts in the binding site having the structural
coordinates for the atoms listed in Table 1, 2, 3, or 4.

In an embodiment of the invention, a secondary or three-dimensional structure of a binding cavity of
a glycosyltransferase that associates with a nucleotide (preferably UDP) of a sugar nucleotide donor (or a
secondary or three-dimensional structure of a complex of the binding site and nucleotide) is provided
comprising at least two, three, four, five, six, seven, eight, nine, or ten atomic contacts of atomic interactions
1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 in Table 5, each atomic interaction defined therein by an atomic contact (more
preferably, a specific atom where indicated) on the nucleotide, and an atomic contact (more preferably, a
specific amino acid residue where indicated) on the glycosyltransferase (i.e. enzyme atomic contact). The
binding site may be defined by enzyme atomic contacts of atomic interactions 1, 2, 6, 7, 8, 9, and 10; 3, 4, 6, 7,
8, 9, and 10; and 1 through 10 in Table 5. Preferably, the binding site is defined by the atoms of the enzyme
atomic contacts in the binding site having the structural coordinates for the atoms listed in Table 1, 2, 3, or 4.

In an embodiment of the invention, a secondary or three-dimensional structure of a binding cavity of
a glycosyltransferase that associates with a sugar nucleotide donor (e.g. UDP-GlcNAc) (or a secondary or
three-dimensional structure of a complex of the binding site with the sugar nucleotide donor) is provided
comprising at least two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen,
sixteen, seventeen, eighteen, nineteen, twenty, or twenty-one atomic contacts of atomic interactions 1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, and 21 in Table 5, each atomic interaction defined
therein by an atomic contact (more preferably, a specific atom where indicated) on the sugar nucleotide donor,
and an atomic contact (more preferably, a specific amino acid residue where indicated) on the
glycosyltransferase (i.e. enzyme atomic contact). The binding site may be defined by enzyme atomic contacts
of atomic interactions 1, 2, 6, 7, 8, 9, 10, 14, 15, 18, and 20; 1, 2, 6, 7, 8, 9, 10, 14, 16, 17, 19, and 21; 3, 4, 6,
7, 8, 9, 10, 14, 15, 18, and 20; 3, 4, 6, 7, 8, 9, 10, 14, 16, 17, 19, and 21; or 1 through 21 listed in Table 5.
Preferably the binding site is defined by the atoms of the enzyme atomic contacts in the binding site having the 
structural coordinates for the atoms listed in Table 1, 2, 3, or 4. 

A glycosyltransferase structure may be characterized by a "loop" structure. The loop folds on top of
the pyrophosphate after the sugar nucleotide donor associates with the active site of the glycosyltransferase.
Molecules that associate with the loop are highly specific inhibitors of the enzymes. In an embodiment of the
invention, a secondary or three-dimensional structure of a loop structure of a glycosyltransferase that binds a
pyrophosphate of a sugar nucleotide donor is provided comprising at least two, three, four, five, six, or seven
atomic contacts of atomic interactions 11, 12, 13, 23, 24, 25, and 27 in Table 5. The binding site may be
defined by enzyme atomic contacts 11, 12, and 13; 11, 12, 13, and 27; 23, 24, 25, and 27; or 11, 12, 13, 23, 24,
25, and 27 in Table 5. Preferably, the binding site is defined by the atoms of the enzyme atomic contacts in the
binding site having the structural coordinates for the atoms listed in Table 1, 2, 3, or 4.

A secondary or three-dimensional structure of a binding site of a glycosyltransferase that associates
with a Man5GlcNAc2-acceptor (or a secondary or three dimensional structure of a complex of the binding site
with the acceptor) is also provided comprising at least two, three, four, five, or six atomic contacts of atomic
interactions 22, 23, 24, 25, 26, and 27 in Table 5, each atomic interaction defined therein by an atomic contact
(more preferably, a specific atom where indicated) on the acceptor, and an atomic contact (more preferably, a
specific amino acid residue where indicated) on the glycosyltransferase (i.e. enzyme atomic contact). The
binding site may be defined by enzyme atomic contacts of atomic interactions 22, 23, and 24; 23, 24, and 25;
24, 25, and 26; 25, 26, and 27; 22, 23, 24, and 25; 23, 24, 25, and 26; 24, 25, 26, and 27; 22, 23, 24, 25, and
26; 23, 24, 25, 26, and 27; and 22 through 27 in Table 5. Preferably, the binding site is defined by the atoms of
the enzyme atomic contacts in the binding site having the structural coordinates for the atoms listed in Table 1,
2, 3, or 4.

Method for Preparing Crystal Forms of a Glycosyltransferase

The invention also features a method for creating crystalline glycosyltransferase structures described
herein. The method may utilize a polypeptide comprising a glycosyltransferase described herein to form a
crystal. A polypeptide used in the method may be chemically synthesized in whole or in part using techniques
that are well-known in the art. Alternatively, methods are well known to the skilled artisan to construct
expression vectors containing the native or mutated glycosyltransferase coding sequence and appropriate
transcriptional/translational control signals. These methods include in vitro recombinant DNA techniques,
synthetic techniques, and in vivo recombination/genetic recombination. See for example the techniques
described in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor
Laboratory Press (1989)), and other laboratory textbooks. (See also Sarker et al., Glycoconjugate J. 7:380,
1990; Sarker et al., Proc. Natl. Acad. Sci. USA 88:234-238, 1991; Sarker et al., Glycoconjugate J. 11:204-209,
1994; Hull et al., Biochem Biophys Res Commun 176:608, 1991; and Pownall et al., Genomics 12:699-704,
1992).

Crystals are grown from an aqueous solution containing the purified glycosyltransferase polypeptide
by a variety of conventional processes. These processes include batch, liquid, bridge, dialysis, vapor diffusion,
and hanging drop methods (see for example, McPherson, 1982, John Wiley, New York; McPherson, 1990,
Eur. J. Biochem. 189:1-23; Webber, 1991, Adv. Protein Chem. 41:1-36). Generally, the native crystals of the
invention are grown by adding precipitants to the concentrated solution of the glycosyltransferase polypeptide.
The precipitants are added at a concentration just below that necessary to precipitate the protein. Water is
removed by controlled evaporation to produce precipitating conditions, which are maintained until crystal
growth ceases.

In an embodiment of the invention, the method comprises mixing a volume of a glycosyltransferase
solution (e.g. 5 mg glycosyltransferase/ml to 15 mg glycosyltransferase/ml, preferably 10 mg/ml) with a
reservoir solution; and equilibrating against the reservoir solution under vapour-diffusion conditions.

It will be appreciated that the crystallization conditions can be varied and such variations can be used 
alone or in combination. 

Derivative crystals of the invention can be obtained by soaking native crystals in a solution containing
salts of heavy metal atoms. A complex of the invention can be obtained by soaking a native crystal in a
solution containing a compound that binds the glycosyltransferase, or it can be obtained by co-crystallizing
the glycosyltransferase polypeptide in the presence of one or more compounds that bind to the
glycosyltransferase.

Once the crystal is grown it can be placed in a glass capillary tube and mounted onto a holding device
connected to an X-ray generator and an X-ray detection device. Collection of X-ray diffraction patterns is
well documented by those skilled in the art (see for example, Ducruix and Geige, 1992, IRL Press, Oxford,
England). A beam of X-rays enters the crystal and diffracts from the crystal. An X-ray detection device can be
utilized to record the diffraction patterns emanating from the crystal. Suitable devices include the Marr 345
imaging plate detector system with an RU200 rotating anode generator.

Methods for obtaining the three dimensional structure of the crystalline form of a molecule or
complex are described herein and known to those skilled in the art (see Ducruix and Geige). Generally, the
X-ray crystal structure is given by the diffraction patterns. Each diffraction pattern reflection is characterized
as a vector and the data collected at this stage determines the amplitude of each vector. The phases of the vectors
may be determined by the isomorphous replacement method where heavy atoms soaked into the crystal are
used as reference points in the X-ray analysis (see for example, Otwinowski, 1991, Daresbury, United
Kingdom, 80-86). The phases of the vectors may also be determined by molecular replacement (see for
example, Naraza, 1994, Proteins 11:281-296). The amplitudes and phases of vectors from the crystalline form
of a glycosyltransferase, e.g. an N-acetylglucosaminyltransferase I, determined in accordance with these
methods can be used to analyze other crystalline glycosyltransferases, particularly those with an SGC domain.

The unit cell dimensions and symmetry, and vector amplitude and phase information can be used in a
Fourier transform function to calculate the electron density in the unit cell, i.e. to generate an experimental
electron density map. This may be accomplished using the PHASES package (Furey, 1990). Amino acid
sequence structures are fit to the experimental electron density map (i.e. model building) using computer
programs (e.g. Jones, T.A. et al., Acta Crystallogr. A47, 100-119, 1991). This structure can also be used to
calculate a theoretical electron density map. The theoretical and experimental electron density maps can be
compared and the agreement between the maps can be described by a parameter referred to as R-factor. A high
degree of overlap in the maps is represented by a low value R-factor. The R-factor can be minimized by using
computer programs that refine the structure to achieve agreement between the theoretical and observed
electron density map. For example, the XPLOR program, developed by Brunger (1992, Nature 355:472-475)
can be used for model refinement.
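
As an illustration of how an R-factor summarizes agreement, the following minimal sketch uses the conventional crystallographic residual, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, computed over structure-factor amplitudes. The text does not spell out this formula, so its use here is an assumption, and the function name and amplitude values are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual between observed and calculated
    structure-factor amplitudes; 0.0 means perfect agreement."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator


# Hypothetical amplitudes for four reflections (illustration only).
observed = [100.0, 50.0, 25.0, 25.0]
calculated = [90.0, 55.0, 30.0, 25.0]

print(f"R = {r_factor(observed, calculated):.2f}")
```

Refinement programs adjust the model so that the calculated amplitudes approach the observed ones, which drives this value down.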

A three dimensional structure of the molecule or complex may be described by atoms that fit the
theoretical electron density characterized by a minimum R value. Files can be created for the structure that
define each atom by coordinates in three dimensions.

Identification of Homologues

The knowledge of a glycosyltransferase structure of the invention enables one skilled in the art to 
identify homologues of glycosyltransferases. This is achieved by searches of three-dimensional databases.
Since structural folds are conserved to a greater extent than sequence, one may identify homologues with very
little sequence identity or similarity. Programs that provide this type of database searching are known in the art
and include Dali. The structural coordinates of a protein structure are submitted and the program performs a
multiple structural alignment with proteins in the Protein Data Bank. Homologues identified in accordance with
the present invention may be used in the methods of the invention described herein.

Methods for Determining Secondary or Three Dimensional Structures

The structure coordinates of a glycosyltransferase structure described herein can be used as a model 
for determining the secondary or three-dimensional structures of additional native or mutated 
glycosyltransferases with unknown structure, as well as the structures of co-crystals of glycosyltransferases
with compounds such as acceptors, donors (e.g. UDP-GlcNAc or analogues thereof), and modulators (e.g.
stimulators or inhibitors). The structure coordinates and models of a glycosyltransferase structure can also be 
used to determine solution-based structures of native or mutant glycosyltransferases. 

Secondary or three-dimensional structure may be determined by applying the structural coordinates of
a glycosyltransferase structure to other data such as an amino acid sequence, X-ray crystallographic diffraction
data, or nuclear magnetic resonance (NMR) data. Homology modeling, molecular replacement, and nuclear
magnetic resonance methods using these other data sets are described below.

Homology modeling (also known as comparative modeling or knowledge-based modeling) methods 
develop a three dimensional model from a polypeptide sequence based on the structures of known proteins 
(e.g. native or mutated glycosyltransferases). In the present invention the method utilizes a computer
representation of a glycosyltransferase structure, preferably a three dimensional structure of an
N-acetylglucosaminyltransferase I, or a complex of same, a computer representation of the amino acid sequence
of a polypeptide with an unknown structure (additional native or mutated glycosyltransferases, or polypeptides
comprising an SGC domain), and standard computer representations of the structures of amino acids. The
method in particular comprises the steps of: (a) identifying structurally conserved and variable regions in the
known structure; (b) aligning the amino acid sequences of the known structure and unknown structure; (c)
generating coordinates of main chain atoms and side chain atoms in structurally conserved and variable regions
of the unknown structure based on the coordinates of the known structure thereby obtaining a homology
model; and (d) refining the homology model to obtain a three dimensional structure for the unknown structure.
This method is well known to those skilled in the art (Greer, 1985, Science 228, 1055; Blundell et al., 1988, Eur.
J. Biochem. 172, 513; Knighton et al., 1992, Science 258:130-135,
http://biochem.vt.edu/courses/modeling/homology.hm). Computer programs that can be used in homology 
modeling are Quanta and the Homology module in the Insight II modeling package distributed by Molecular 
Simulations Inc, or MODELLER (Rockefeller University, www.iucr.ac.uk/sinris-top/logical/prg- 
modeller.html). 

In step (a) of the homology modeling method, the known glycosyltransferase structure (e.g. structure
of the N-acetylglucosaminyltransferase I) is examined to identify the structurally conserved regions (SCRs)
from which an average structure, or framework, can be constructed for these regions of the protein. Variable
regions (VRs), in which known structures may differ in conformation, also must be identified. SCRs generally
correspond to the elements of secondary structure, such as alpha-helices and beta-sheets, and to ligand- and
substrate-binding sites (e.g. acceptor and donor binding sites). The VRs usually lie on the surface of the
proteins and form the loops where the main chain turns.

Many methods are available for sequence alignment of known structures and unknown structures.
Sequence alignments generally are based on the dynamic programming algorithm of Needleman and Wunsch
[J. Mol. Biol. 48: 442-453, 1970]. Current methods include FASTA, Smith-Waterman, and BLASTP, with the
BLASTP method differing from the other two in not allowing gaps. Scoring of alignments typically involves
construction of a 20x20 matrix in which identical amino acids and those of similar character (i.e., conservative
substitutions) may be scored higher than those of different character. Substitution schemes which may be used
to score alignments include the scoring matrices PAM (Dayhoff et al., Meth. Enzymol. 91: 524-545, 1983),
and BLOSUM (Henikoff and Henikoff, Proc. Nat. Acad. Sci. USA 89: 10915-10919, 1992), and the matrices
based on alignments derived from three-dimensional structures including that of Johnson and Overington (JO
matrices) (J. Mol. Biol. 233: 716-738, 1993).
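
The dynamic programming approach referred to above can be sketched as follows. This is a minimal global-alignment scorer in the style of Needleman and Wunsch; as a simplifying assumption it uses a flat match/mismatch score and a linear gap penalty in place of a 20x20 substitution matrix such as PAM or BLOSUM, and the function name and parameter values are hypothetical.

```python
def needleman_wunsch_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of two sequences."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    # score[i][j] holds the best score for aligning seq_a[:i] with seq_b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # prefix of seq_a aligned entirely to gaps
    for j in range(1, cols):
        score[0][j] = j * gap  # prefix of seq_b aligned entirely to gaps
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align the two residues
                score[i - 1][j] + gap,               # gap in seq_b
                score[i][j - 1] + gap,               # gap in seq_a
            )
    return score[-1][-1]


# Hypothetical three-residue fragments in one-letter code (illustration only).
print(needleman_wunsch_score("ACD", "ACD"))
print(needleman_wunsch_score("ACD", "AD"))
```

Replacing the flat match/mismatch term with a lookup into a substitution matrix would give conservative substitutions a higher score than dissimilar ones, as described in the text.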

Alignment based solely on sequence may be used; however, other structural features also may be
taken into account. In Quanta, multiple sequence alignment algorithms are available that may be used when
aligning a sequence of the unknown with the known structures. Four scoring systems (i.e. sequence homology,
secondary structure homology, residue accessibility homology, CA-CA distance homology) are available, each 
of which may be evaluated during an alignment so that relative statistical weights may be assigned. 

When generating coordinates for the unknown structure, main chain atoms and side chain atoms, both
in SCRs and VRs, need to be modeled. A variety of approaches known to those skilled in the art may be used to
assign coordinates to the unknown. In particular, the coordinates of the main chain atoms of SCRs will be
transferred to the unknown structure. VRs correspond most often to the loops on the surface of the polypeptide
and if a loop in the known structure is a good model for the unknown, then the main chain coordinates of the
known structure may be copied. Side chain coordinates of SCRs and VRs are copied if the residue type in the
unknown is identical to or very similar to that in the known structure. For other side chain coordinates, a side
chain rotamer library may be used to define the side chain coordinates. When a good model for a loop cannot
be found, fragment databases may be searched for loops in other proteins that may provide a suitable model for
the unknown. If desired, the loop may then be subjected to conformational searching to identify low energy
conformers.

Once a homology model has been generated it is analyzed to determine its correctness. A computer
program available to assist in this analysis is the Protein Health module in Quanta which provides a variety of
tests. Other programs that provide structure analysis along with output include PROCHECK and 3D-Profiler
[Luthy R. et al., Nature 356: 83-85, 1992; and Bowie, J.U. et al., Science 253: 164-170, 1991]. Once any
irregularities have been resolved, the entire structure may be further refined. Refinement may consist of energy
minimization with restraints, especially for the SCRs. Restraints may be gradually removed for subsequent
minimizations. Molecular dynamics may also be applied in conjunction with energy minimization.

Molecular replacement involves applying a known structure to solve the X-ray crystallographic data
set of a polypeptide of unknown structure (e.g. native or mutated glycosyltransferases). The method can be
used to define the phases describing the X-ray diffraction data of a polypeptide of unknown structure when
only the amplitudes are known. Commonly used computer software packages for molecular replacement are
X-PLOR (Brunger 1992, Nature 355: 472-475), AMoRE (Navaza, 1994, Acta Crystallogr. A50: 157-163), the
CCP4 package (Collaborative Computational Project, Number 4, "The CCP4 Suite: Programs for Protein
Crystallography", Acta Cryst., Vol. D50, pp. 760-763, 1994), and the MERLOT package (P.M.D. Fitzgerald,
J. Appl. Cryst., Vol. 21, pp. 273-278, 1988). It is preferable that the resulting structure not exhibit a
root-mean-square deviation of more than 3 Å.

Molecular replacement computer programs generally involve the following steps: (1) determining
the number of molecules in the unit cell and defining the angles between them (self rotation function); (2)
rotating the known structure (e.g. glycosyltransferase) against diffraction data to define the orientation of the
molecules in the unit cell (rotation function); (3) translating the known structure in three dimensions to
correctly position the molecules in the unit cell (translation function); (4) determining the phases of the X-ray
diffraction data and calculating an R-factor from the reference data set and from the new data,
wherein an R-factor between 30-50% indicates that the orientations of the atoms in the unit cell have been
reasonably determined by the method; and (5) optionally, decreasing the R-factor to about 20% by refining the
new electron density map using iterative refinement techniques known to those skilled in the art (refinement).

In an embodiment of the invention, a method is provided for determining three dimensional structures
of polypeptides with unknown structure (e.g. additional native or mutated glycosyltransferases) by (a) applying
the structural coordinates of a glycosyltransferase structure to provide an X-ray crystallographic data set for a
polypeptide of unknown structure, and (b) determining a low energy conformation of the resulting structure.

The structural coordinates of a glycosyltransferase structure may be applied to nuclear magnetic
resonance (NMR) data to determine the three dimensional structures of polypeptides (e.g. additional native or
mutated glycosyltransferases, or polypeptides comprising an SGC domain). (See for example, Wuthrich, 1986,
John Wiley and Sons, New York: 176-199; Pflugrath et al., 1986, J. Molecular Biology 189: 383-386; Kline et
al., 1986, J. Molecular Biology 189: 377-382). While the secondary structure of a polypeptide may often be
determined by NMR data, the spatial connections between individual pieces of secondary structure are not as
readily determined. The structural coordinates of a polypeptide defined by X-ray crystallography can guide the
NMR spectroscopist to an understanding of the spatial interactions between secondary structural elements in a
polypeptide of related structure. Information on spatial interactions between secondary structural elements can
greatly simplify Nuclear Overhauser Effect (NOE) data from two-dimensional NMR experiments. In addition,
applying the structural coordinates after the determination of secondary structure by NMR techniques
simplifies the assignment of NOEs relating to particular amino acids in the polypeptide sequence and does not
greatly bias the NMR analysis of polypeptide structure.

In an embodiment, the invention relates to a method of determining three dimensional structures of
polypeptides with unknown structures, preferably native or mutated glycosyltransferases or polypeptides
comprising an SGC domain, by applying the structural coordinates of a glycosyltransferase structure of the
invention to nuclear magnetic resonance (NMR) data of the unknown structure. This method comprises the
steps of: (a) determining the secondary structure of an unknown structure using NMR data; and (b) simplifying
the assignment of through-space interactions of amino acids. The term "through-space interactions" defines
the orientation of the secondary structural elements in the three dimensional structure and the distances
between amino acids from different portions of the amino acid sequence. The term "assignment" defines a
method of analyzing NMR data and identifying which amino acids give rise to signals in the NMR spectrum.

Identification of Modulators of Glycosyltransferases

Modulators (e.g. inhibitors) of a glycosyltransferase (or a binding site or domain thereof) may be
designed and identified that may modify the inappropriate activity of a glycosyltransferase involved in a
clinical disorder. The rational design and identification of modulators of glycosyltransferases can be
accomplished by utilizing the atomic structural coordinates that define a glycosyltransferase structure, or a part
thereof. Structure-based modulator design and identification methods are powerful techniques that can involve
searches of computer data bases containing a variety of potential modulators and chemical functional groups.
(See Kuntz et al., 1994, Acc. Chem. Res. 27: 117; Guida, 1994, Current Opinion in Struc. Biol. 4: 777; and
Colman, 1994, Current Opinion in Struc. Biol. 4: 868, for reviews of structure-based drug design and
identification; and Kuntz et al., 1982, J. Mol. Biol. 162: 269; Kuntz et al., 1994, Acc. Chem. Res. 27: 117; Meng
et al., 1992, J. Comp. Chem. 13: 505; Bohm, 1994, J. Comp. Aided Molec. Design 8: 623 for methods of
structure-based modulator design).

The glycosyltransferase structures, and parts thereof described herein, and the structures of other 
polypeptides determined by the homology modeling, molecular replacement, and NMR techniques described 
herein can also be applied to modulator design and identification methods. 

Modulators of glycosyltransferases may be identified by docking the computer representation of 
compounds from a data base of molecules. Data bases which may be used include ACD (Molecular Designs 
Limited), NCI (National Cancer Institute), CCDC (Cambridge Crystallographic Data Center), CAST 
(Chemical Abstract Service), Derwent (Derwent Information Limited), Maybridge (Maybridge Chemical
Company Ltd), Aldrich (Aldrich Chemical Company), DOCK (University of California in San Francisco), and
the Directory of Natural Products (Chapman & Hall). Computer programs such as CONCORD (Tripos
Associates) or DB-Converter (Molecular Simulations Limited) can be used to convert a data set represented in
two dimensions to one represented in three dimensions.

The computer programs may comprise the following steps: 

(a) docking a computer representation of a structure of a compound into a computer representation 
of an active-site (e.g. binding site or SGC domain) of a glycosyltransferase defined in accordance 
with the invention using the computer program, or by interactively moving the representation of 
the compound into the representation of the active-site; 

(b) characterizing the geometry and the complementary interactions formed between the atoms of the
active-site and the compound; optionally

(c) searching libraries for molecular fragments which can fit into the empty space between the
compound and active site and can be linked to the compound; and

(d) linking the fragments found in (c) to the compound and evaluating the new modified compound.

Methods are also provided for identifying a potential modulator of a glycosyltransferase function by
docking a computer representation of a compound with a computer representation of a structure of a
glycosyltransferase that is defined by the binding sites, atomic interactions, atomic contacts, or atomic
structural coordinates described herein. In an embodiment the method comprises the following steps:



(a) docking a computer representation of a compound from a computer data base with a 
computer representation of a selected site (e.g. the sugar nucleotide donor or acceptor 
binding site, or SGC domain) on a glycosyltransferase structure defined in accordance with
the invention to obtain a complex;

(b) determining a conformation of the complex with a favourable geometric fit and favourable
complementary interactions; and

(c) identifying compounds that best fit the selected site as potential modulators of the
glycosyltransferase.

"Docking" refers to a process of placing a compound in close proximity with an active site of a
polypeptide (i.e. a glycosyltransferase), or a process of finding low energy conformations of a
compound/polypeptide complex (i.e. compound/glycosyltransferase complex).

Examples of other computer programs that may be used for structure-based modulator design are
CAVEAT (Bartlett et al., 1989, in "Chemical and Biological Problems in Molecular Recognition", Roberts,
S.M.; Ley, S.V.; Campbell, N.M., eds; Royal Society of Chemistry: Cambridge, pp 182-196); FLOG (Miller et
al., 1994, J. Comp. Aided Molec. Design 8: 153); PRO Modulator (Clark et al., 1995, J. Comp. Aided Molec.
Design 9: 13); MCSS (Miranker and Karplus, 1991, Proteins: Structure, Function, and Genetics 8: 195); and
GRID (Goodford, 1985, J. Med. Chem. 28: 849).

In an embodiment of the invention, a method is provided for identifying potential modulators of
glycosyltransferase function. The method utilizes the structural coordinates of a glycosyltransferase three
dimensional structure, or binding site or domain thereof. The method comprises the steps of (a) generating a
computer representation of a glycosyltransferase structure, preferably an N-acetylglucosaminyltransferase I
structure, and docking a computer representation of a compound from a computer data base with a computer
representation of an active site (e.g. sugar nucleotide donor or acceptor binding site) of the glycosyltransferase
to form a complex; (b) determining a conformation of the complex with a favourable geometric fit or favourable
complementary interactions; and (c) identifying compounds that best fit the glycosyltransferase active-site as
potential modulators of glycosyltransferase function. The initial glycosyltransferase structure may or may not
have compounds bound to it. A favourable geometric fit occurs when the surface area of a compound in a
compound-glycosyltransferase complex is in close proximity with the surface area of the active-site of the
glycosyltransferase without forming unfavourable interactions. A favourable complementary interaction occurs
where a compound in a compound-glycosyltransferase complex interacts by hydrophobic, aromatic, ionic, or
hydrogen donating and accepting forces, with the active-site of a glycosyltransferase without forming
unfavourable interactions. Unfavourable interactions may be steric hindrance between atoms in the compound
and atoms in the glycosyltransferase active-site.
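
The notions of geometric fit and steric hindrance described above can be illustrated with a minimal Python sketch. It is not the scoring scheme of any of the docking programs named herein: the function name `score_fit` and both distance cut-offs (2.5 Å for a clash, 4.5 Å for a contact) are assumptions chosen only for illustration, and chemical complementarity is ignored.

```python
import math


def score_fit(compound_atoms, site_atoms, clash_cutoff=2.5, contact_cutoff=4.5):
    """Count close contacts and steric clashes between two sets of atoms.

    compound_atoms, site_atoms : lists of (x, y, z) coordinates in angstroms
    clash_cutoff               : assumed distance below which a pair is a clash
    contact_cutoff             : assumed maximum distance for a close contact
    Returns (contacts, clashes); many contacts with no clashes suggests a
    favourable geometric fit under these assumptions.
    """
    contacts = 0
    clashes = 0
    for c in compound_atoms:
        for s in site_atoms:
            d = math.dist(c, s)
            if d < clash_cutoff:
                clashes += 1
            elif d <= contact_cutoff:
                contacts += 1
    return contacts, clashes


# Illustrative coordinates: one clash (1 A), one contact (4 A), one distant atom (10 A).
example_score = score_fit([(0.0, 0.0, 0.0)],
                          [(1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (10.0, 0.0, 0.0)])
```

Candidate poses could then be ranked by preferring those with zero clashes and the largest number of contacts.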

In another embodiment, potential modulators are identified utilizing a glycosyltransferase structure
with or without compounds bound to it. The method comprises the steps of (a) modifying a computer
representation of a glycosyltransferase (e.g. an N-acetylglucosaminyltransferase I) having one or more
compounds bound to it, where the computer representations of the compound or compounds and
glycosyltransferase are defined by atomic structural coordinates; (b) determining a conformation of the
complex with a favourable geometric fit and favourable complementary interactions; and (c) identifying the
compounds that best fit the glycosyltransferase active site as potential modulators. A computer representation
may be modified by deleting or adding a chemical group or groups. Computer representations of the chemical
groups can be selected from a computer database.

Another way of identifying potential modulators is to modify an existing modulator in a polypeptide
active-site. The computer representation of modulators can be modified within the computer representation of
a glycosyltransferase active-site. This technique is described in detail in Molecular Simulations User Manual,
1995 in LUDI. The computer representation of a modulator may be modified by deleting a chemical group or
groups, or by adding a chemical group or groups. After each modification to a compound, the atoms of the
modified compound and active-site can be shifted in conformation and the distance between the modulator and
the active site atoms may be scored on the basis of geometric fit and favourable complementary interactions
between the molecules. Compounds with favourable scores are potential modulators.

Compounds designed by modulator building or modulator searching computer programs may be
screened to identify potential modulators. Examples of such computer programs include programs in the
Molecular Simulations Package (Catalyst), ISIS/HOST, ISIS/BASE, and ISIS/DRAW (Molecular Designs
Limited), and UNITY (Tripos Associates). A building program may be used to replace computer
representations of chemical groups in a compound complexed with a glycosyltransferase with groups from a
computer data base. A searching program may be used to search computer representations of compounds from
a computer database that have similar three dimensional structures and similar chemical groups as a compound
that binds to a glycosyltransferase. The programs may be operated on the structure of the active-site (e.g.
binding sites, or SGC domain) of a glycosyltransferase structure, preferably an
N-acetylglucosaminyltransferase I.

A typical program may comprise the following steps: 

(a) mapping chemical features of a compound such as hydrogen bond donors or acceptors,
hydrophobic/lipophilic sites, positively ionizable sites, or negatively ionizable sites;

(b) adding geometric constraints to selected mapped features; and

(c) searching data bases with the model generated in (b).
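
Steps (a) to (c) can be sketched in Python as a toy feature-based search. This is only an illustration under stated assumptions and does not reproduce Catalyst, ISIS, or UNITY: the feature labels, the functions `build_model`, `matches`, and `search`, the in-memory `database` dictionary, and the 0.5 Å tolerance are all hypothetical.

```python
import math
from itertools import permutations


def build_model(mapped_features, constrained_pairs):
    """Step (b): attach distance constraints to selected mapped features.

    mapped_features   : list of (feature_type, (x, y, z)) from step (a)
    constrained_pairs : index pairs (i, j) whose separation must be preserved
    """
    types = [ftype for ftype, _ in mapped_features]
    constraints = {
        (i, j): math.dist(mapped_features[i][1], mapped_features[j][1])
        for i, j in constrained_pairs
    }
    return types, constraints


def matches(model, candidate, tolerance):
    """True if some assignment of candidate features satisfies the model."""
    types, constraints = model
    for combo in permutations(range(len(candidate)), len(types)):
        if any(candidate[k][0] != t for k, t in zip(combo, types)):
            continue  # feature types must agree position by position
        if all(
            abs(math.dist(candidate[combo[i]][1], candidate[combo[j]][1]) - d) <= tolerance
            for (i, j), d in constraints.items()
        ):
            return True
    return False


def search(model, database, tolerance=0.5):
    """Step (c): return the names of database entries that fit the model."""
    return [name for name, feats in database.items() if matches(model, feats, tolerance)]


# Step (a), with made-up coordinates: three mapped features of a bound compound.
reference = [("donor", (0.0, 0.0, 0.0)), ("acceptor", (3.0, 0.0, 0.0)),
             ("hydrophobic", (0.0, 4.0, 0.0))]
model = build_model(reference, [(0, 1), (0, 2), (1, 2)])
database = {
    "hit": [("hydrophobic", (10.0, 4.2, 0.0)), ("donor", (10.0, 0.0, 0.0)),
            ("acceptor", (13.1, 0.0, 0.0))],
    "miss": [("donor", (0.0, 0.0, 0.0)), ("acceptor", (8.0, 0.0, 0.0)),
             ("hydrophobic", (0.0, 4.0, 0.0))],
}
found = search(model, database)
```

The exhaustive assignment over permutations is adequate only for a handful of features per compound; it is used here for clarity rather than efficiency.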

In an embodiment of the invention a method of identifying potential modulators of a
glycosyltransferase, preferably an N-acetylglucosaminyltransferase I, is provided using the three dimensional
conformation of the glycosyltransferase in various modulator construction or modulator searching computer
programs on compounds complexed with the glycosyltransferase. The method comprises the steps of (a)
generating a computer representation of one or more compounds complexed with a glycosyltransferase; (b) (i)
searching a data base for a compound with a similar geometric structure or similar chemical groups to the
generated compounds using a computer program that searches computer representations of compounds from a
database that have similar three dimensional structures and similar chemical groups, or (ii) replacing portions
of the compounds complexed with the glycosyltransferase with similar chemical structures (i.e. nearly identical
shape and volume) from a database using a compound construction computer program that replaces computer
representations of chemical groups with groups from a computer database, where the representations of the
compounds are defined by structural coordinates.

A compound that interacts with a glycosyltransferase or selected binding sites or domains thereof
identified using a method of the invention may be used as a modulator of any glycosyltransferase or
composition bearing the interacting binding site or domains. Therefore, the invention features a modulator of a
glycosyltransferase identified by a method of the invention.



The invention further contemplates a method for designing potential inhibitors of a
glycosyltransferase comprising the step of using the structural coordinates of a sugar nucleotide donor or
acceptor or component thereof, or an acceptor or components thereof, defined in relation to its spatial
association with a glycosyltransferase structure or a binding site or domain thereof, to generate a compound
that is capable of associating with the glycosyltransferase or binding site or domain thereof.

In an embodiment of the invention, a method is provided for designing potential inhibitors of a
glycosyltransferase comprising the step of using the structural coordinates of uridine, uracil, or UDP listed in
Table 3 [ATOMS 2828-2835 (uracil); 2836-2844 (ribose); and 2845-2851 (diphosphate)] to generate a
compound for associating with the active site of a glycosyltransferase. The following steps are employed in a
particular method of the invention: (a) generating a computer representation of uridine, uracil, or UDP, defined
by its structural coordinates listed in Table 3; and (b) searching for molecules in a data base that are structurally
or chemically similar to the defined uridine, uracil, or UDP, using a searching computer program, or replacing
portions of the compound with similar chemical structures from a database using a compound building
computer program.

In another embodiment of the invention, a method is provided for designing potential inhibitors of a
glycosyltransferase comprising the step of using the structural coordinates of UDP-GlcNAc listed in Table 3
(ATOMS 2813-2851), to generate a compound for associating with the active site of a glycosyltransferase.
The following steps are employed in a particular method of the invention: (a) generating a computer
representation of UDP-GlcNAc defined by its structural coordinates listed in Table 3; and (b) searching for
molecules in a data base that are structurally or chemically similar to the defined UDP-GlcNAc using a
searching computer program, or replacing portions of the compound with similar chemical structures from a
database using a compound building computer program.

In another embodiment of the invention, a method is provided for designing potential inhibitors of a
glycosyltransferase comprising the step of using the structural coordinates of a Man5GlcNAc2 acceptor listed in
Table 4, to generate a compound for associating with the active site of a glycosyltransferase. In Table 4, the
coordinates of a Man5GlcNAc2 acceptor are listed as ATOMS 3043 through 3126 where the mannose and
GlcNAc residues designated as X, Y, U, V, W, Z, and A have the following positions in the acceptor:

[Diagram of the acceptor residue positions not reproduced.]

The following steps are employed in a particular method of the invention: (a) generating a computer
representation of a Man5GlcNAc2 acceptor defined by its structural coordinates listed in Table 4; and (b)
searching for molecules in a data base that are structurally or chemically similar to the defined Man5GlcNAc2
acceptor using a searching computer program, or replacing portions of the compound with similar chemical
structures from a database using a compound building computer program.

It will be appreciated that a modulator of a glycosyltransferase may be identified by generating an 
actual three-dimensional model of a binding cavity, synthesizing a compound, and examining the components 
to find whether the required interaction occurs. 

Potential modulators of glycosyltransferases identified using the above-described methods may be
prepared using methods described in standard reference sources utilized by those skilled in the art. For
example, organic compounds may be prepared by organic synthetic methods described in references such as
March, 1994, Advanced Organic Chemistry: Reactions, Mechanisms, and Structure, New York, McGraw Hill.

The invention also relates to a potential modulator identified by the methods of the invention. In
particular, classes of modulators of glycosyltransferases are provided that are based on the three-dimensional
structure of a sugar nucleotide donor, or component thereof, or acceptor, defined in relation to the sugar
nucleotide donor's or acceptor's spatial association with a glycosyltransferase structure. Modulators of
glycosyltransferases comprise a compound comprising the structure of uracil, uridine, ribose, pyrophosphate,
or UDP, and having one or more, preferably all, of the structural coordinates of uracil, uridine, ribose,
pyrophosphate, or UDP of Table 3 [ATOMS 2828-2835 (uracil); 2836-2844 (ribose); and 2845-2851
(diphosphate)]. In an embodiment, modulators are provided comprising the structure of UDP-GlcNAc and
having one or more, preferably all, of the structural coordinates of UDP-GlcNAc of Table 3 (ATOMS 2813-2851).
Functional groups in the uracil, uridine, ribose, pyrophosphate, UDP, or UDP-GlcNAc modulators may
be substituted with, for example, alkyl, alkoxy, hydroxyl, aryl, cycloalkyl, alkenyl, alkynyl, thiol, thioalkyl,
thioaryl, amino, or halo, or they may be modified using techniques known in the art.

Modulators are also contemplated that comprise the structure of a Man5GlcNAc2 acceptor for a
glycosyltransferase with the structural coordinates of Man5GlcNAc2 acceptor listed in Table 4 (ATOMS 3043
through 3126). Functional groups in an acceptor structure may be substituted with, for example, alkyl, alkoxy,
hydroxyl, aryl, cycloalkyl, alkenyl, alkynyl, thiol, thioalkyl, thioaryl, amino, or halo, or they may be modified
using techniques known in the art.

The invention contemplates all optical isomers and racemic forms of the modulators of the invention.

Compositions and Methods of Treatment 

The modulators of the invention may be used to modulate the biological activity of a
glycosyltransferase in a cell, including modulating a pathway in a cell regulated by the glycosyltransferase or
modulating a glycosyltransferase with inappropriate activity in a cellular organism. In addition, a
glycosyltransferase structure of the invention may be used to devise protocols to modulate the biological
activity of a glycosyltransferase in a cell.

Cellular assays, as well as animal model assays in vivo, may be used to test the activity of a potential
modulator of a glycosyltransferase as well as to diagnose a disease associated with inappropriate
glycosyltransferase activity. In vivo assays are also useful for testing the bioactivity of a potential modulator
designed by the methods of the invention.

The modulators (e.g. inhibitors) identified using the methods of the invention may be useful in the
treatment and prophylaxis of tumor growth and metastasis of tumors. Anti-metastatic effects of inhibitors can
be demonstrated using a lung colonization assay. For example, melanoma cells treated with an inhibitor may
be injected into mice and the ability of the melanoma cells to colonize the lungs of the mice may be examined
by counting tumor nodules on the lungs after death. Suppression of tumor growth in mice by the inhibitor
administered orally or intravenously may be examined by measuring tumor volume.

An inhibitor identified using the invention may have particular application in the prevention of tumor
recurrence after surgery, i.e. as an adjuvant therapy.

An inhibitor may be especially useful in the treatment of various forms of neoplasia such as
leukemias, lymphomas, melanomas, adenomas, sarcomas, and carcinomas of solid tissues in patients. In
particular, inhibitors can be used for treating malignant melanoma, pancreatic cancer, cervico-uterine cancer,
ovarian cancer, cancer of the kidney such as metastatic renal cell carcinoma, stomach, lung, rectum, breast,
bowel, gastric, liver, thyroid, head and neck cancers such as unresectable head and neck cancers, lymphangitis
carcinomatosis, cancers of the cervix, breast, salivary gland, leg, tongue, lip, bile duct, pelvis, mediastinum,
urethra, bronchogenic, bladder, esophagus and colon, non-small cell lung cancer, and Kaposi's Sarcoma which
is a form of cancer associated with HIV-infected patients with Acquired Immune Deficiency Syndrome
(AIDS). The inhibitors may also be used for other anti-proliferative conditions such as bacterial and viral
infections, in particular AIDS.

An inhibitor identified in accordance with the present invention may be used to treat
immunocompromised subjects. For example, it may be used in a subject infected with HIV, or other viruses
or infectious agents including bacteria, fungi, and parasites, in a subject undergoing bone marrow transplants,
and in subjects with chemical or tumor-induced immune suppression.

Inhibitors may be used as hemorestorative agents and in particular to stimulate bone marrow cell
proliferation, in particular following chemotherapy or radiotherapy. The myeloproliferative activity of an
inhibitor of the invention may be determined by injecting the inhibitor into mice, sacrificing the mice,
removing bone marrow cells and measuring the ability of the inhibitor to stimulate bone marrow proliferation
by directly counting bone marrow cells and by measuring clonogenic progenitor cells in methylcellulose
assays. The inhibitors can also be used as chemoprotectants, and in particular to protect mucosal epithelium
following chemotherapy.

An inhibitor identified in accordance with the invention also may be used as an antiviral agent, in
particular on membrane enveloped viruses such as retroviruses, influenza viruses, cytomegaloviruses and
herpes viruses. An inhibitor may also be used to treat bacterial, fungal, and parasitic infections. For example, a
small molecule inhibitor can be used to prevent or treat infections caused by the following: Neisseria species
such as Neisseria meningitidis and N. gonorrhoeae; Chlamydia species such as Chlamydia pneumoniae,
Chlamydia psittaci, and Chlamydia trachomatis; Escherichia coli; Haemophilus species such as Haemophilus
influenzae; Yersinia enterocolitica; Salmonella species such as S. typhimurium; Shigella species such as Shigella
flexneri; Streptococcus species such as S. agalactiae and S. pneumoniae; Bacillus species such as Bacillus
subtilis; Branhamella catarrhalis; Borrelia burgdorferi; Pseudomonas aeruginosa; Coxiella burnetii;
Campylobacter species such as C. hyoilei; Helicobacter pylori; and Klebsiella species such as Klebsiella
pneumoniae.

An inhibitor may also be used in the treatment of inflammatory diseases such as rheumatoid arthritis,
asthma, inflammatory bowel disease, and atherosclerosis.

An inhibitor may also be used to augment the anti-cancer effects of agents such as interleukin-2 and
poly-IC, to augment natural killer and macrophage tumoricidal activity, induce cytokine synthesis and
secretion, enhance expression of LAK and HLA class I specific antigens, activate protein kinase C, stimulate
bone marrow cell proliferation including hematopoietic progenitor cell proliferation, and increase engraftment
efficiency and colony-forming unit activity, to confer protection against chemotherapy and radiation therapy
(e.g. chemoprotective and radioprotective agents), and to accelerate recovery of bone marrow cellularity
particularly when used in combination with chemical agents commonly used in the treatment of human diseases
including cancer and acquired immune deficiency syndrome (AIDS). For example, an inhibitor can be used as
a chemoprotectant in combination with anti-cancer agents including doxorubicin, 5-fluorouracil,
cyclophosphamide, and methotrexate, and in combination with isoniazid or NSAID.

The present invention thus provides a method for treating the above-mentioned conditions in a subject
comprising administering to a subject an effective amount of a modulator of the invention. The invention also
contemplates a method for stimulating or inhibiting tumor growth or metastasis in a subject comprising
administering to a subject an effective amount of a modulator of the invention.

The invention still further relates to a pharmaceutical composition which comprises a
glycosyltransferase structure of the invention or a part thereof (e.g. an active site, a phosphate-binding loop lid,
an SGC domain, DxD motif), or a modulator of the invention in an amount effective to regulate one or more
of the above-mentioned conditions (e.g. tumor growth or metastasis) and a pharmaceutically acceptable carrier,
diluent or excipient.

The compositions of the invention are administered to subjects in a biologically compatible form
suitable for pharmaceutical administration in vivo. By "biologically compatible form suitable for
administration in vivo" is meant a form of the active ingredient to be administered in which any toxic effects
are outweighed by the therapeutic effects of the active ingredient. The term "subject" is intended to include
mammals and includes humans, dogs, cats, mice, rats, and transgenic species thereof. Administration of a
therapeutically active amount of the pharmaceutical compositions of the present invention is defined as an
amount effective, at dosages and for periods of time necessary to achieve the desired result. For example, a
therapeutically active amount of a modulator of the invention may vary according to factors such as the
condition, age, sex, and weight of the individual. Dosage regimes may be adjusted to provide the optimum
therapeutic response. For example, several divided doses may be administered daily or the dose may be
proportionally reduced as indicated by the exigencies of the therapeutic situation.

The active compound may be administered in a convenient manner such as by injection
(subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or intracerebral
administration.

A pharmaceutical composition of the invention can be administered to a subject in an appropriate
carrier or diluent, co-administered with enzyme inhibitors or in an appropriate carrier such as microporous or
solid beads or liposomes. The term "pharmaceutically acceptable carrier" as used herein is intended to include
diluents such as saline and aqueous buffer solutions. Liposomes include water-in-oil-in-water emulsions as
well as conventional liposomes (Strejan et al., (1984) J. Neuroimmunol 7:27). The active compound may also
be administered parenterally or intraperitoneally. Dispersions can also be prepared in glycerol, liquid
polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these
preparations may contain a preservative to prevent the growth of microorganisms. Depending on the route of
administration, the active compound may be coated to protect the compound from the action of enzymes, acids,
and other natural conditions which may inactivate the compound.

Therapeutic administration of polypeptide modulators may also be accomplished using gene therapy.
A nucleic acid including a promoter operatively linked to a heterologous polypeptide may be used to produce
high-level expression of the polypeptide in cells transfected with the nucleic acid. DNA or isolated nucleic
acids may be introduced into cells of a subject by conventional nucleic acid delivery systems. Suitable delivery
systems include liposomes, naked DNA, and receptor-mediated delivery systems, and viral vectors such as
retroviruses, herpes viruses, and adenoviruses.

The following non-limiting examples are illustrative of the present invention:

EXAMPLE 1 

Crystals of alpha-1,3-mannosyl-glycoprotein beta-1,2-N-acetylglucosaminyltransferase (GnT-I) were
grown by the vapour-diffusion method from protein drops containing 10 mg/ml GnT-I, 10 mM MES buffer,
pH 5.5, 270 mM KCl, 2-5 mM MnCl2, and 10 mM UDP-GlcNAc, mixed with, and equilibrated against,
15-25% polyethylene glycol 8000, 100 mM Tris buffer, pH 7.9, 0 to 5% glycerol, and 0 to 10% isopropanol.
Plate-like crystals grew within a few days, in space group P2₁2₁2₁ (a = 40.4 Å, b = 82.4 Å, c = 102.5 Å,
α = β = γ = 90°), with one molecule in the asymmetric unit, and 40% solvent content. Data were collected from
the crystals flash-frozen in a 100 K N2 stream, after a ten-minute wash with 21% polyethylene glycol 8000,
15% glycerol, and 100 mM Tris buffer, pH 7.9.
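
The relationship between the cell parameters, the number of molecules in the cell, and the solvent content can be sketched with the standard Matthews estimate. The Python below is an illustration only: the function names are hypothetical, the value Z = 4 follows from one molecule per asymmetric unit in an orthorhombic P2₁2₁2₁ cell, and the molecular mass must be supplied by the user (it is not stated in this example, and the estimate depends strongly on it).

```python
def orthorhombic_cell_volume(a, b, c):
    """Unit cell volume in cubic angstroms for a cell with all angles 90 degrees."""
    return a * b * c


def matthews(volume, z, mass_da):
    """Matthews coefficient (A^3/Da) and estimated solvent fraction.

    volume  : unit cell volume in cubic angstroms
    z       : number of molecules in the unit cell
    mass_da : molecular mass in daltons (an input the user must supply)
    Uses the common approximation solvent fraction = 1 - 1.23 / Vm.
    """
    vm = volume / (z * mass_da)
    return vm, 1.0 - 1.23 / vm


# Cell parameters reported above; the volume is about 341,218 cubic angstroms.
cell_volume = orthorhombic_cell_volume(40.4, 82.4, 102.5)
```

Calling `matthews(cell_volume, 4, mass_da)` with the mass of the crystallized fragment would give an estimate to compare against the reported solvent content.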

Atomic structural coordinates of an N-acetylglucosaminyltransferase I are set out in Table 1. Atomic
coordinates of an N-acetylglucosaminyltransferase I with bound MeHg are set out in Table 2. The atomic
structural coordinates of a rabbit N-acetylglucosaminyltransferase I bound to UDP-GlcNAc and a manganese
2+ ion are shown in Table 3. Atomic structural coordinates of an N-acetylglucosaminyltransferase I with
acceptor are shown in Table 4. Figures 1 to 26, 28 to 30, and 32 to 40B illustrate glycosyltransferase
structures, or binding sites or domains thereof.

EXAMPLE 2

The X-ray crystal structure of a soluble fragment containing the catalytic domain of a rabbit
(Oryctolagus cuniculus) GnT I was determined at 1.4 Å resolution. The 342 residue catalytic domain of GnT I
was expressed as an N-terminal histidine-tagged fusion protein (Sarkar et al., Glycoconjugate J. 15: 193-197,
1998), using the baculovirus/Sf9 system in a 3.5 litre bioreactor. The protein was purified using a CM HyperD
F column, followed by a Ni-affinity column. The histidine tag was removed by enterokinase cleavage. Crystals
were grown using the hanging-drop vapor diffusion method, from drops containing 10 mg/ml protein, 10 mM
MES pH 6.5, 250 mM KCl, 2 mM MnCl2, and 10 mM UDP-GlcNAc, and wells containing 17.5%-19.5%
PEG 8000, 5% glycerol, and 100 mM Tris-HCl pH 7.9. Native and two-wavelength mercury-derivative data
were collected using frozen crystals on the F2 beam line at the Cornell High Energy Synchrotron Source. The
crystals grow in space group P2₁2₁2₁, with cell parameters a = 40.4 Å, b = 82.4 Å, c = 102.5 Å. The structure
was solved using the multiwavelength anomalous dispersion technique. GnT I contains both an eight-stranded
mixed beta-sheet, flanked by six alpha helices, and a four-stranded mixed beta sheet, backed by three alpha
helices. The structure reveals that the catalytic domain has dimensions 54 Å x 52 Å x 37 Å, with a large pocket
on one face capable of holding both the UDP-GlcNAc donor and the Man5Gn2 acceptor. Sequence comparison
shows that residues found in the pocket are very well conserved among GnT I sequences from different
species. The pocket is flanked by a loop, not seen in the electron density map, which plays a role in either
catalysis or substrate binding.

EXAMPLE 3 

X-ray Crystal Structure of N-Acetylglucosaminyltransferase I: Structure, Mechanism, and the SGC
Superfamily

Overall Structure 

The catalytic fragment of rabbit GnT I (residues 106-447; Sarkar et al., 1998) was crystallized in the
presence of UDP-GlcNAc and Mn²⁺, and solved by the multi-wavelength anomalous diffraction (MAD)
phasing method using a methylmercury chloride derivative (Table 6). In particular, crystals were grown using
the hanging drop vapor diffusion method, by mixing equal 1.5 µl volumes of protein solution (10 mg/ml GnT I
catalytic fragment, 10 mM MES buffer, pH 5.5, 270 mM KCl, 2 mM MnCl2 and 10 mM UDP-GlcNAc) with
well solution (15-25% polyethylene glycol 8000, 100 mM Tris buffer, pH 7.9, and 5% glycerol), and
equilibrating against 1 ml of the well solution. A mercury derivative was obtained by soaking a crystal in well
solution containing 20 mM MeHgCl. All data were collected using Quantum 4 charge-coupled device detectors
on the F2 beamline of the Cornell High Energy Synchrotron Source, using crystals flash-frozen in a 100 K
N2 stream. Data were integrated, scaled, and reduced with DENZO and SCALEPACK (Otwinowski and
Minor, 1997). The mercury position was identified with SOLVE (Terwilliger and Berendzen, 1999), and
refined using SHARP (La Fortelle and Bricogne, 1997). Solvent flattening and histogram matching were
performed using DM (Cowtan, 1994). The resultant experimental map was traced using the program O (Jones
et al., 1991), and the model refined with multiple rounds of manual rebuilding using O and OOPS2 (Kleywegt
and Jones, 1996), alternated with simulated annealing and positional and B-factor refinement using CNS
(Brunger et al., 1998). The initial model was also refined against the "native" and "complex" data in a similar
fashion.

In total, the structure was refined against data sets from three different crystals. Of these, no bound
nucleotide sugar or Mn²⁺ ion was observed in the mercury "derivative" (refined at 1.4 Å resolution) nor in the
structure refined against the data set that was termed "native" (1.5 Å resolution). In contrast to these apo
structures, both components were seen in the "complex" (1.8 Å resolution). Since the native and derivative
data sets were collected on samples that had aged before the x-ray data were collected, it was assumed that in
these cases the UDP-GlcNAc had been hydrolysed.

GnT I is a two-domain protein, with overall dimensions of approximately 65 Å x 40 Å x 50 Å (Figure
25). The N-terminal domain (domain 1: residues 106-317) is an eight-stranded mixed β-sheet (β1-β8), flanked
by six α-helices (α1-α6) and a small two-stranded antiparallel β-sheet (β4' and β5'). The smaller C-terminal
domain (domain 2: residues 354-447) is a four-stranded mixed β-sheet (β9, β10, β13 and β14), flanked by
three α-helices (α7-α9) and a short β-finger (β11 and β12). The two domains are connected by a linker region
(residues 331 to 353) which wraps halfway around domain 1 before starting the first helix of domain 2. The
~1050 Å² interface between domain 1 and domain 2 is quite hydrophilic, and contains 20 bridging water
molecules. The residues buried in the interface on domain 1 are 53% polar, while those in domain 2 are 36%
polar.

The α-helices α3, α5 and α6 sit on "top" of the central β-sheet and create a pocket for the nucleotide
sugar and oligosaccharide acceptor. Electrostatic potential analysis shows that this pocket is largely acidic, in
contrast to the rest of the protein surface, which is primarily positively charged. The nucleotide sugar itself sits
between helices α3 and α6 and β-strands β1, β2 and β4. The topology and structure of β-strands β1 to β4,
and helices α1 to α3, are similar to those of the corresponding elements in domains possessing the Rossmann
fold; however, the orientation of the nucleotide sugar with respect to these elements is not.

In the native and derivative structures, in which UDP-GlcNAc and the Mn²⁺ ion were not observed,
there is also no electron density for the 13-residue loop (residues 318-330) adjacent to the nucleotide-sugar
binding site. The "missing loop" is presumed to be disordered in these crystals, as SDS-PAGE analysis of
washed crystals showed the protein to be intact. These residues are structured in the complex, and are found to
form a flap that partially covers the UDP-GlcNAc moiety. Although the loop is structured by UDP-GlcNAc
binding, only its tip makes direct interactions with the nucleotide sugar. Approximately 50 Å² is buried between
the tip of the loop and the UDP-GlcNAc phosphates. Structuring the loop also buries ~600 Å² of protein surface
adjacent to the nucleotide-sugar binding site. In these crystals the active site and the loop itself are exposed to
a large solvent channel, and are not involved in crystal contacts. Aside from structuring the loop, there is no
major conformational change associated with UDP-GlcNAc binding. The native and complex structures show a
root-mean-squared deviation (rmsd) of 0.28 Å, based on the α-carbon atoms of residues 106 to 317 and 331 to 447.
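
The rmsd figure quoted above is a root-mean-square of per-atom distances between equivalent α-carbon atoms. The following is a minimal sketch of that calculation, assuming the two coordinate sets have already been superimposed and are listed in the same residue order; the function name and the toy coordinates are illustrative and are not taken from the tables in this specification.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superimposed and that the i-th entry
    of each list refers to the same atom (e.g. the same residue's C-alpha).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example: every atom displaced by 0.3 angstrom along z.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
complex_ = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3), (2.0, 1.0, 0.3)]
print(round(rmsd(native, complex_), 3))
```

In practice the atom selection would be restricted to the residues common to both models, which is why the disordered loop (318-330) is excluded from the comparison described above.
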

The Nucleotide Sugar and Metal Binding Sites

As shown by the structure of the complex, the UDP-GlcNAc is bound in the anti conformation. The
uracil ring is sandwiched between I187 and the C115-C145 cysteine bridge, and its N3 and O2 make key
hydrogen bond interactions with D144 and H190, respectively (Table 7 and Figure 33A). Moreover, its C5 is
in van der Waals contact with V321, part of the loop structured by UDP-GlcNAc binding. The ribose O2' and
O3' atoms make a water-mediated and a direct hydrogen bond, respectively, with the carboxyl side chain of
D212. This Asp is the middle residue in the DXD motif common to a number of glycosyltransferases, as will
be discussed in detail below.

The Mn²⁺ ion shows an octahedral geometry coordinated by six "inner-sphere" oxygen atoms (Table
7 and Figure 33B). The α- and β-phosphates of UDP-GlcNAc each contribute a coordinating oxygen atom, as
do three water molecules. These water molecules are, in turn, hydrogen bonded to "outer-sphere" protein
residues E211, D213, T315 and G317. The remaining high-energy inner-sphere metal ligand is provided by
the carboxyl group of D213, the only direct interaction with the protein. As such, it seems that GnT I does
not have an independent metal binding site capable of binding Mn²⁺ in the absence of UDP-GlcNAc. In
addition to coordinating the Mn²⁺ ion, the phosphates make direct interactions with the protein. The
α-phosphate makes a salt bridge with R117 and a hydrogen bond to the amide nitrogen of V321, and the
β-phosphate hydrogen bonds to the hydroxyl group of S322. These interactions with V321 and S322 are an
important component of the UDP-GlcNAc dependent structuring of the loop. Overall, the phosphates are in a
conformation typical of divalent metal-bound nucleotides (Black et al., 1994).

Finally, the GlcNAc moiety itself makes several interactions with the protein (Table 7 and Figure
33C). The vicinal O3 and O4 hydroxyls are hydrogen bonded with the carboxyl group of E211 in a fashion
seen in many lectin-carbohydrate complexes (Vyas, 1991). The O4 hydroxyl appears to play a central role, as
it also makes a strong hydrogen bond with W290. The O6 hydroxyl is hydrogen bonded to a tightly bound
water molecule seen in both the apo and complex structures. Van der Waals interactions are also important,
most notably between the N-acetyl methyl group and the side chains of L269 and L331.

The Glycosyltransferase DxD Motif

The DxD motif has been identified in many glycosyltransferase families and is thought to be involved
in Mn²⁺ ion binding. The motif contains two Asp residues and is typically flanked by apolar residues
(hhhhDxDxh) (Wiggins and Munro, 1998). (See Figures 27 and 31 for DxD motif alignments.) Site-directed
mutagenesis has shown that both Asp residues are required for yeast α-1,3-mannosyltransferase activity
(Wiggins and Munro, 1998). In GnT I, the motif is present in a modified form (²¹¹EDD²¹³), and with L214 forms
the i to i+3 residues of a type I β-turn connecting β-strands β4 and β4' (Figure 39). As such, the highly
conserved acidic residues are directed toward the same face of the turn. The fact that β4 runs through the core
of the protein is consistent with the observed presence of several apolar residues on the N-terminal side of the
motif.
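
The hhhhDxDxh pattern described above can be expressed as a regular expression for scanning a protein sequence. The following is a minimal sketch under stated assumptions: the set of residues treated as apolar ("h") is an illustrative choice, the relaxed pattern that admits Glu at the first acidic position is a hypothetical extension to cover the EDD form, and the test sequences are invented rather than taken from any glycosyltransferase.

```python
import re

# Assumption: residues counted as apolar ("h") for illustration only.
APOLAR = "AVLIMFWC"

# Strict form: four apolar residues, D, any residue, D, any residue, apolar.
STRICT = re.compile(rf"[{APOLAR}]{{4}}D.D.[{APOLAR}]")
# Relaxed form (hypothetical): allow Glu at the first acidic position (ExD).
RELAXED = re.compile(rf"[{APOLAR}]{{4}}[DE].D.[{APOLAR}]")

def find_motifs(sequence, pattern=STRICT):
    """Return (1-based position of the first acidic residue, matched text)."""
    return [(m.start() + 5, m.group()) for m in pattern.finditer(sequence.upper())]

toy_dxd = "MKLLVVIDADGLKK"   # invented sequence containing LVVIDADGL
toy_edd = "MKLLVVIEDDGLKK"   # invented sequence containing LVVIEDDGL

print(find_motifs(toy_dxd))            # strict pattern matches
print(find_motifs(toy_edd))            # strict pattern does not match
print(find_motifs(toy_edd, RELAXED))   # relaxed pattern matches
```

A pattern search of this kind only flags candidates; as the structural discussion makes clear, whether a hit forms the turn and binds the metal has to be judged from the structure.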

The interactions with UDP-GlcNAc and the Mn²⁺ ion illustrate the importance of the motif. As
discussed above, the second conserved Asp (D213) makes the only direct interaction with the bound Mn²⁺ ion.
In addition, it makes a hydrogen bond with one of the metal-coordinating water molecules, which itself is
hydrogen bonded to the first conserved acidic residue (E211). Overall, these residues are conformationally
constrained by the well-defined octahedral geometry characteristic of Mn²⁺ ion coordination. Since the
phosphates of the nucleotide sugar also coordinate the manganese ion, it serves to define the relative
orientation of the nucleotide sugar and the conserved acidic residues. In the case of GnT I this positions the
GlcNAc moiety of the donor sugar for interaction with the first residue of the motif. In other sugar nucleoside
diphosphate/Mn²⁺-dependent glycosyltransferases, the first Asp of the DxD motif would be expected to play a
carbohydrate-binding role, regardless of the nucleotide sugar type/linkage. It is well known that Asp is a key
residue in carbohydrate binding proteins, and is thus well suited for such a role. Clearly, the well-conserved
DxD motif does not simply serve to bind metal, but rather coordinates both the Mn²⁺ ion and the sugar moiety
of the donor.

Reaction Mechanism

Catalysis by inverting glycosyltransferases is believed to involve a general base, such as Asp or Glu,
which serves to assist in the deprotonation of the nucleophilic hydroxyl of the acceptor. In GnT I the only
residue capable of playing this role is D291, 4.7 Å away from the GlcNAc C1 (Figure 33C). The structure
shows that the acceptor will be able to approach the UDP-GlcNAc donor, so as to permit in-line nucleophilic
attack and inversion of stereochemistry at the GlcNAc C1. Furthermore, the Mn²⁺ ion is disposed to pull
developing negative charge away from the β-phosphate of the UDP leaving group (a role which may be aided
by R117), and the hydration state of the ion is likely to play a crucial role in this (Cowan, 1998; Dudev et al.,
1999).

Mechanistically, the reaction is thought to involve an oxocarbenium ion intermediate, similar to that
proposed for glycosidases. Since glycosidases reduce the activation energy of the hydrolysis reaction by
binding their substrates in a distorted conformation, the GlcNAc ring conformation was examined for a similar
effect. However, there was no evidence of significant distortion, suggesting that the UDP-GlcNAc is bound in
a low energy conformation: the sugar ring is a standard ⁴C₁ chair, and the glycosidic linkage is in an allowed
conformation (Petrova et al., 1999). As such, the UDP-GlcNAc is conceivably no more susceptible to
nucleophilic attack by water than it would be in solution. Presumably, the activation energy for catalysis is
derived from acceptor binding.

Loop Structuring and the Acceptor Binding Pocket

Comparison of the apo and complex structures shows that UDP-GlcNAc binding structures the 318-330
loop, forming a flap that partly covers the UDP-GlcNAc (Figure 40A). As discussed above, V321 and
S322, at the tip of the loop, make hydrogen bonds to the α- and β-phosphates of the UDP-GlcNAc. Residues
320-323 form a type IV turn, while the C-terminal residues 324-330 make one complete turn of an α-helix.
The loop folds upon itself, burying residue F327 against R318 and the non-loop residues T315, L331 and
K332. The only conformational changes other than structuring the loop itself are a peptide flip (F316-G317)
and a reorientation of the T315 side chain. These changes are critical as the G317 carbonyl and the T315
hydroxyl are repositioned to make hydrogen bonds with two of the Mn²⁺ ion coordinating water molecules (see
Figure 40A and Figure 33B).

As shown in Figure 40B, loop structuring creates a deep pocket, terminating over the proposed
catalytic base (D291) and the GlcNAc moiety. The pocket itself can accommodate only a single
monosaccharide residue of the Man5GlcNAc2 acceptor. One complete side of this pocket is formed by the loop
structured upon UDP-GlcNAc binding. As a result, two loop residues (S322 and F326), fully conserved among
active GnT I sequences, are presented to the acceptor binding pocket (Figure 40B). To explore the potential
roles played by these and other residues in the binding pocket, a mannose residue was modeled into the site.
With the attacking O2 hydroxyl positioned between the Asp 291 Oδ2 and the UDP-GlcNAc C1, only one
general orientation leads to reasonable steric and chemical interactions with the protein. In this orientation, the
exocyclic C6 hydroxymethyl group of the mannose interacts with S322 and F326, while the O3 and O4 point
toward D291, R295 and R415.

The importance of the mannose O3, O4 and O6 predicted by this model is consistent with substrate
studies using synthetic analogues of the trimannose core of the acceptor (Moller, 1992; Reck, 1995). In these
studies it was further shown that even in the trimannose core, the known specificity of GnT I for the
Manα1,3-arm over that of the Manα1,6-arm of the acceptor is preserved. This specificity is presumably dictated
by interactions involving the β-mannose O4, the only other trisaccharide hydroxyl group found to be important.
Extending the model to include all residues of the trimannose core (in its solution conformation) (Brisson and
Cowen, 1983), the β-mannose O4 is positioned to interact with either D291 or D292. Similar interactions
cannot be made when the 6-arm mannose is positioned in the binding pocket. Presumably the incoming
nucleophile and associated binding energy serve to drive the reaction toward the transition state and ultimately
product formation.

Enzyme Kinetics

Analysis has shown that GnT I proceeds through an ordered sequential Bi Bi kinetic mechanism
(Nishikawa et al., 1988). The enzyme first binds Mn²⁺/UDP-GlcNAc and then the Man5GlcNAc2 acceptor; the
carbohydrate product is released first, followed by UDP. The GnT I structures provide an explanation for
these observations. Since UDP-GlcNAc binding is required to structure the loop, and create the acceptor
binding site, it is clear that the nucleotide sugar must bind first. Once catalysis has occurred, the UDP product
cannot maintain the loop in its structured conformation, the acceptor binding pocket is destroyed, and the
oligosaccharide product is released. UDP, which is bound more weakly to GnT I than UDP-GlcNAc, is then
free to diffuse out of the binding site, to be replaced by a fresh molecule of UDP-GlcNAc. By destroying the
acceptor/product binding pocket, these kinetics also ensure that the enzyme is not strongly inhibited by the
oligosaccharide product.
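
For readers less familiar with the kinetic scheme, the textbook initial-rate equation for an ordered sequential Bi Bi mechanism in the absence of products is v = Vmax[A][B] / (KiA·KB + KB[A] + KA[B] + [A][B]), where A is the first substrate to bind and B the second. The following is a minimal sketch of that equation with A standing for Mn²⁺/UDP-GlcNAc and B for the acceptor; all kinetic constants below are hypothetical placeholders and are not values measured for GnT I.

```python
def ordered_bi_bi_rate(a, b, vmax, ka, kb, kia):
    """Initial rate for an ordered sequential Bi Bi mechanism (no products).

    a, b : concentrations of the first (A) and second (B) substrate
    vmax : limiting rate
    ka, kb : Michaelis constants for A and B
    kia  : dissociation constant of the enzyme-A complex
    All constants must be positive; units must be mutually consistent.
    """
    denominator = kia * kb + kb * a + ka * b + a * b
    return vmax * a * b / denominator

# Hypothetical constants, for illustration only.
VMAX, KA, KB, KIA = 1.0, 0.04, 0.5, 0.1

for donor in (0.01, 0.1, 1.0):
    rate = ordered_bi_bi_rate(donor, 1.0, VMAX, KA, KB, KIA)
    print(f"[A] = {donor:5.2f}  v = {rate:.3f}")
```

Because A must bind before B, the rate is zero whenever [A] is zero regardless of [B], which mirrors the structural argument that the acceptor site does not exist until the nucleotide sugar has structured the loop.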

The structure also shows that GnT I does not itself have a Mn²⁺ ion binding site: there is only a
single direct protein-metal interaction. The Mn²⁺ ion is clearly more fully coordinated by UDP-GlcNAc, and
positioned on the surface of the protein by virtue of its interactions with the nucleotide sugar. This mode of
binding may also be an important determinant of how the enzyme releases its products. In the absence of an
independent metal binding site, the UDP-Mn²⁺ complex would be free to dissociate from the enzyme surface
once catalysis has occurred.

The suggestion that bound UDP cannot support loop structuring stems from an analysis of the loop's
interactions with UDP-GlcNAc in the complex. As discussed earlier, two residues (V321 and S322) at the tip
of the loop form hydrogen bonds with oxygen atoms from the two phosphates. The loop's interactions are not
otherwise very extensive, altogether burying only 50 Å² of the bound nucleotide sugar. Once the bond
between the GlcNAc C1 and the β-phosphate oxygen is broken, the terminal phosphate acquires an additional
negative charge and presumably greater mobility (the latter enhanced by the lack of an independent Mn²⁺ ion
binding site). Together, these effects would be expected to disrupt the ability of the phosphates to structure the
loop. As such, it would seem that the structured loop can be thought of as a sensor for the integrity of the
GlcNAc-phosphate linkage, thereby regulating formation and destruction of the acceptor/product binding site.

The SGC Domain

Analysis shows that the structure of domain 1 of GnT I is very similar to that of the B. subtilis
glycosyltransferase spsA (residues 2-217) (Charnock and Davies, 1999). It possesses an identical topology,
and all of the major secondary structural elements characterizing the domain are found in both structures
(Figure 29). The domain is also found, with some modification in secondary structure (the topology remains
the same), in β4Gal-T1 (residues 180-346) (Gastinel et al., 1999), and GlmU (residues 4-227, Figure 30)
(Brown et al., 1999). Structural alignment using the program DALI yields Z-scores of 15.7, 10.6 and 9.8 with
spsA, β4Gal-T1 and GlmU, respectively. The very strong structural similarity between GnT I domain 1 and
spsA suggests the existence of a canonical core domain, the SGC domain (spsA GnT I core domain),
represented in these four structures.

Despite the structural similarity shown by these enzymes, they do not show significant sequence
similarity. Even with a knowledge of the structural alignment, GnT I shows only 10%, 12%, and 7% sequence
identity with spsA, β4Gal-T1, and GlmU, respectively. These levels of identity make it difficult, if not
impossible, to establish whether or not these enzymes stem from a common ancestor. Analysis of residues
critical for function may, however, shed light on this question. The position of the UDP moiety in the GnT I
complex is virtually identical to that found in the spsA complex (Figure 29) and is also very similar to that seen
in the β4Gal-T1 and GlmU complexes. Moreover, the DxD motif is present in all four of these proteins and
forms a perfectly superimposable type I β-turn in each case. Finally, at the position of D291, the proposed
catalytic base in GnT I, both glycosyltransferases, spsA (D191) and β4Gal-T1 (D318), also possess Asp
residues. Not only are these key residues and functional features identical in these structures, they are found at
the same position on the structural/topological framework. The low sequence identity, common fold, and related
functional features define the SGC superfamily, whose members are therefore likely to share a common
evolutionary origin (Murzin et al., 1995).
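
Sequence identity over a structure-based alignment, as quoted above, is a count of matching residues at structurally equivalent positions. The following is a minimal sketch of such a calculation; the choice of denominator (positions aligned in both sequences, gaps excluded) is an assumption, since the convention actually used for the figures above is not stated, and the aligned strings are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over positions where both aligned sequences have a residue.

    Assumption: gapped positions are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        aligned += 1
        if res_a == res_b:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned

# Invented alignment: 5 positions aligned in both sequences, 4 identical.
print(percent_identity("ACD-EF", "ACE-EF"))
```

Other conventions divide by the length of the shorter sequence or of the full alignment, and these give lower values for the same alignment.
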

The SGC Superfamily

The lack of sequence identity between glycosyltransferases with different specificities has led to a
classification that now includes 44 glycosyltransferase families. GnT I, for example, is in a family of its own,
and a Position-Specific Iterated BLAST (PSI-BLAST) search, using the GnT I sequence, identifies no other
related glycosyltransferases. Based on the knowledge that the GnT I SGC domain is structurally similar to
spsA, an attempt was made at finding sequence similarity between these and other glycosyltransferases,
thereby extending the SGC superfamily. The spsA sequence, coming from a much larger glycosyltransferase
family containing many divergent sequences, provides a more robust profile, and it was used to seed a
PSI-BLAST search (Altschul et al., 1997). The search was able to identify similarity between spsA (family 2) and
rabbit GnT I (family 13). It also showed similarity between spsA and the β-1,4-GalNAc transferases (family
12), the ceramide glucosyltransferases (family 21) and the polypeptide GalNAc transferases (family 27);
neither β4Gal-T1 (family 7) nor GlmU appeared in the searches.

To further explore possible relationships among the glycosyltransferase families, protein threading
was used to determine the compatibility of a number of glycosyltransferase sequences with the SGC domain.
Using the program THREADER 2, a single arbitrarily selected sequence from each of the 27
glycosyltransferase families described by Campbell et al. (Campbell et al., 1998; Campbell et al., 1997) was
run against a database of 1900 structures, which included the SGC domains of GnT I, spsA, β4Gal-T1 and
GlmU. In both the normal and randomized tests, the selected sequences from family 2, family 7, and
family 13 ranked first or second against the SGC domain of spsA, β4Gal-T1 and GnT I, respectively, as would
be expected. The sequences from family 3, family 6, family 16 and family 26 also ranked first or second in the
two tests; sequences from several other families also received high scores. These results, and those based on
PSI-BLAST searching, suggest that the SGC domain is widely represented among different families and
includes both inverting and retaining glycosyltransferases.

Table 8 shows protein threading results. Proteins from different families were threaded against a
THREADER 2 database containing 1900 protein folds, including GnT I, spsA, GlmU, and β4Gal-T1. The
folds were sorted on the basis of their filtered combined energy Z-scores. When a GTCD-1-containing fold was
one of the top thirty hits, out of 1900, then the top thirty hits were rerun with a randomization test of fifty
shuffled-sequence threadings for each fold, to give a combined energy shuffled Z-score. A correct prediction
should score well in both tests. Note that not only are inverting families represented, but so are retaining
glycosyltransferases.
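
The randomization test described above compares the energy of the real sequence on a fold with the energies of shuffled versions of the same sequence. The following is a minimal sketch of that idea only: THREADER 2's own energy function and score definition are not reproduced here, the sign convention (lower energy is better, so a positive Z-score is favourable) is an assumption, and the energies in the example are invented.

```python
import random
import statistics

def shuffle_sequence(sequence, rng):
    """Return a copy of the sequence with its residues in random order."""
    residues = list(sequence)
    rng.shuffle(residues)
    return "".join(residues)

def shuffled_z_score(native_energy, shuffled_energies):
    """Z-score of the native threading energy against shuffled-sequence energies.

    Assumption: lower energy is better, so the score is (mean - native) / sd
    and a large positive value indicates a fold-specific fit.
    """
    mean = statistics.mean(shuffled_energies)
    sd = statistics.stdev(shuffled_energies)
    if sd == 0:
        raise ValueError("shuffled energies have zero spread")
    return (mean - native_energy) / sd

rng = random.Random(0)
print(shuffle_sequence("ACDEFGHIKL", rng))   # same composition, random order

# Invented energies; the text describes fifty shuffled threadings per fold.
example_shuffled = [0.0, 1.0, -1.0, 2.0, -2.0]
print(round(shuffled_z_score(-10.0, example_shuffled), 2))
```

Shuffling preserves amino acid composition, so a high score in this test indicates that the order of residues, and not merely their composition, is compatible with the fold.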

Conclusion

The structure of the catalytic domain of GnT I has provided the basis for its Mn²⁺/UDP-GlcNAc
binding properties, as well as insight into both its catalytic and kinetic mechanisms. The structure of the DxD
motif shows that the first conserved residue plays a role in binding the donor sugar, while the second
coordinates the essential Mn²⁺ ion. These roles are likely to be conserved in other DxD-containing
glycosyltransferases, regardless of donor specificity. In addition, structural analysis has defined the SGC
domain, seen in GnT I, spsA, β4Gal-T1 and GlmU. Sequence analysis and protein threading show that the
SGC domain is contained in enzymes from several of the existing inverting and retaining glycosyltransferase
families. Among these are enzymes involved in mammalian and O-linked oligosaccharide biosynthesis,
bacterial cell wall production, and the synthesis of glycogen, chitin and cellulose. Together, they constitute the
SGC superfamily.

Having illustrated and described the principles of the invention in a preferred embodiment, it should
be appreciated by those skilled in the art that the invention can be modified in arrangement and detail without
departure from such principles. All modifications coming within the scope of the following claims are claimed.

All publications, patents and patent applications referred to herein are incorporated by reference in
their entirety to the same extent as if each individual publication, patent or patent application was specifically
and individually indicated to be incorporated by reference in its entirety. In particular, U.S. provisional patent
applications Serial Nos. 60/139,949, filed June 18, 1999, 60/161,809, filed October 27, 1999, 60/178,401,
filed January 27, 2000, and 60/202,509, filed May 5, 2000, are incorporated herein by reference.
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Table 1 

REMARK GnT-1 native structure, "gntlg"
REMARK Ulug Unligil, 1999 06 14
REMARK coordinates from restrained individual B-factor refinement
REMARK refinement resolution: 500.0 - 1.5 A
REMARK starting r= .2186 free_r= .2322
REMARK final r= .1991 free_r= .2154
REMARK B rmsd for bonded mainchain atoms= .796 target= 1.5
REMARK B rmsd for bonded sidechain atoms= 1.517 target= 2.0
REMARK B rmsd for angle mainchain atoms= 1.237 target= 2.0
REMARK B rmsd for angle sidechain atoms= 2.317 target= 2.5
REMARK wa= .685709
REMARK rweight= .167519
REMARK target= mlf steps= 60
REMARK sg= P2(1)2(1)2(1) a= 40.478 b= 82.423 c= 102.480 alpha= 90 beta= 90 gamma= 90
REMARK parameter file 1 : CNS_TOPPAR:protein_rep.param
REMARK parameter file 2 : CNS_TOPPAR:water_rep.param
REMARK molecular structure file: generate_easy.mtf
REMARK input coordinates: bgroup.ann.pdb
REMARK reflection file= ../data/gntlg_start.cv
REMARK ncs= none
REMARK B-correction resolution: 6.0 - 1.5
REMARK initial B-factor correction applied to f_w3 :
REMARK B11= -.092 B22= 1.661 B33= -1.569
REMARK B12= .000 B13= .000 B23= .000
REMARK B-factor correction applied to coordinate array B: -.314
REMARK bulk solvent: density level= .380844 e/A^3, B-factor= 35.5223 A^2
REMARK reflections with |Fobs|/sigma_F < 0.0 rejected
REMARK reflections with |Fobs| > 10000 * rms(Fobs) rejected
REMARK anomalous diffraction data was input
REMARK theoretical total number of refl. in resol. range: 106027 ( 100.0 % )
REMARK number of unobserved reflections (no entry or |F|=0): 6093 ( 5.7 % )
REMARK number of reflections rejected: 0 ( .0 % )
REMARK total number of reflections used: 99934 ( 94.3 % )
REMARK number of reflections in working set: 95035 ( 89.6 % )
REMARK number of reflections in test set: 4899 ( 4.6 % )
REMARK FILENAME="bindividual.ann.pdb"
REMARK DATE: 14-Jun-99 15:30:36 created by user: ulu
REMARK VERSION: 0.5
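
The reflection counts in the header of Table 1 are related by simple bookkeeping: reflections used equal the theoretical total minus those unobserved or rejected, and the working and test sets partition the reflections used. The following is a minimal sketch that recomputes the percentages from the counts listed above; the function name and dictionary keys are illustrative and are not part of any refinement program.

```python
def reflection_summary(theoretical, unobserved, rejected, working, test):
    """Recompute reflection counts and percentages of the theoretical total."""
    used = theoretical - unobserved - rejected
    if working + test != used:
        raise ValueError("working and test sets do not add up to reflections used")

    def pct(count):
        return round(100.0 * count / theoretical, 1)

    return {
        "used": used,
        "unobserved_pct": pct(unobserved),
        "used_pct": pct(used),
        "working_pct": pct(working),
        "test_pct": pct(test),
    }

# Counts taken from the Table 1 header.
table_1 = reflection_summary(
    theoretical=106027, unobserved=6093, rejected=0, working=95035, test=4899
)
for key, value in table_1.items():
    print(f"{key:15s} {value}")
```

The same check can be applied to the headers of Tables 2 and 3 by substituting their counts.
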
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Table 2 

REMARK GnT-1 structure with MeHg bound, "gntlf"
REMARK Ulug Unligil, 1999 06 11
REMARK coordinates from restrained individual B-factor refinement
REMARK refinement resolution: 500.0 - 1.5 A
REMARK starting r= .2545 free_r= .2672
REMARK final r= .2369 free_r= .2501
REMARK B rmsd for bonded mainchain atoms= .840 target= 1.5
REMARK B rmsd for bonded sidechain atoms= 1.595 target= 2.0
REMARK B rmsd for angle mainchain atoms= 1.299 target= 2.0
REMARK B rmsd for angle sidechain atoms= 2.451 target= 2.5
REMARK wa= .901697
REMARK rweight= .157734
REMARK target= mlf steps= 30
REMARK sg= P2(1)2(1)2(1) a= 40.382 b= 82.378 c= 102.487 alpha= 90 beta= 90 gamma= 90
REMARK parameter file 1 : CNS_TOPPAR:protein_rep.param
REMARK parameter file 2 : ../data/mmc.param
REMARK parameter file 3 : CNS_TOPPAR:water_rep.param
REMARK molecular structure file: generate_easy.mtf
REMARK input coordinates: bgroup.ann.pdb
REMARK reflection file= ../data/gntl_start.cv
REMARK ncs= none
REMARK B-correction resolution: 6.0 - 1.5
REMARK initial B-factor correction applied to f_w1 :
REMARK B11= -.069 B22= 1.877 B33= -1.809
REMARK B12= .000 B13= .000 B23= .000
REMARK B-factor correction applied to coordinate array B: -.760
REMARK bulk solvent: density level= .377577 e/A^3, B-factor= 29.956 A^2
REMARK reflections with |Fobs|/sigma_F < 0.0 rejected
REMARK reflections with |Fobs| > 10000 * rms(Fobs) rejected
REMARK anomalous diffraction data was input
REMARK theoretical total number of refl. in resol. range: 105746 ( 100.0 % )
REMARK number of unobserved reflections (no entry or |F|=0): 22053 ( 20.9 % )
REMARK number of reflections rejected: 0 ( .0 % )
REMARK total number of reflections used: 83693 ( 79.1 % )
REMARK number of reflections in working set: 79589 ( 75.3 % )
REMARK number of reflections in test set: 4104 ( 3.9 % )
REMARK FILENAME="bindividual.ann.pdb"
REMARK DATE: 11-Jun-99 11:49:39 created by user: ulu

REMARK VERSION: 0.5

Table 3

REMARK GnT I "be" Structure of rabbit GnT I bound to UDP-GlcNAc and a
REMARK Manganese 2+ ion. Ulug Unligil & Dr. James Rini, Oct 25, 1999
REMARK coordinates from minimization refinement
REMARK refinement resolution: 50.0 - 1.8 A
REMARK starting r= 0.2006 free_r= 0.2388
REMARK final r= 0.1987 free_r= 0.2388
REMARK rmsd bonds= 0.006698 rmsd angles= 1.36297
REMARK wa= 1.4
REMARK target= mlf cycles= 1 steps= 200
REMARK sg= P2(1)2(1)2(1) a= 40.541 b= 82.190 c= 101.956 alpha= 90 beta= 90 gamma= 90
REMARK parameter file 1 : CNS_TOPPAR:protein_rep.param
REMARK parameter file 2 : CNS_TOPPAR:ion.param
REMARK parameter file 3 : ../../data/udpglcnac.param
REMARK parameter file 4 : ../../data/glycerol.param
REMARK parameter file 5 : CNS_TOPPAR:water_rep.param
REMARK molecular structure file: /alternate.mtf
REMARK input coordinates: bindividual.bi4-10.pdb
REMARK reflection file= ../../data/gntlbe.cv
REMARK ncs= none
REMARK B-correction resolution: 6.0 - 1.8
REMARK initial B-factor correction applied to fobs :
REMARK B11= 4.245 B22= 1.052 B33= -5.296
REMARK B12= 0.000 B13= 0.000 B23= 0.000
REMARK B-factor correction applied to coordinate array B: -1.075
REMARK bulk solvent: density level= 0.415966 e/A^3, B-factor= 55.91 A^2
REMARK reflections with |Fobs|/sigma_F < 0.0 rejected
REMARK reflections with |Fobs| > 10000 * rms(Fobs) rejected
REMARK anomalous diffraction data was input
REMARK theoretical total number of refl. in resol. range: 61022 ( 100.0 % )
REMARK number of unobserved reflections (no entry or |F|=0): 18103 ( 29.7 % )
REMARK number of reflections rejected: 0 ( 0.0 % )
REMARK total number of reflections used: 42919 ( 70.3 % )
REMARK number of reflections in working set: 40743 ( 66.8 % )
REMARK number of reflections in test set: 2176 ( 3.6 % )
CRYST1 40.541 82.190 101.956 90.00 90.00 90.00 P 21 21 21
REMARK FILENAME="minimize.bi4.14.pdb"
REMARK DATE:22-Oct-1999
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31.804 23.608 1.00 32.60 S 

7.079 46.829 1.00 32.34 S
1.522 7.763 1.00 36.53 S
22.435 42.612 1.00 20.83 S
33.613 4.881 1.00 26.21 S
15.957 16.082 1.00 28.40 S
21.698 46.859 1.00 30.72 S
-0.197 27.338 1.00 23.66 S
-1.507 5.188 1.00 39.30 S
-7.875 31.501 1.00 24.11 S
8.537 -10.616 1.00 29.30 S
13.660 52.902 1.00 36.37 S
-0.218 39.407 1.00 27.91 S
-1.231 7.398 1.00 27.04 S
19.034 -1.488 1.00 24.17 S
15.085 24.642 1.00 26.76 S
-4.327 2.438 1.00 25.87 S
4.836 -3.572 1.00 30.43 S
15.134 -4.164 1.00 19.63 S
8.039 16.251 1.00 50.29 S
6.946 51.230 1.00 36.25 S
12.828 27.293 1.00 39.05 S
4.064 -3.368 1.00 36.12 S
18.307 15.547 1.00 23.71 S
12.906 25.780 1.00 20.16 S
29.149 25.832 1.00 14.00 S
14.389 23.493 1.00 24.67 S
2.189 39.362 1.00 21.88 S
-11.307 20.713 1.00 41.06 S
29.879 34.518 1.00 22.40 S
29.076 4.799 1.00 20.46 S
35.766 2.936 1.00 24.75 S
-5.510 27.151 1.00 28.00 S
-2.319 29.129 1.00 28.16 S
17.542 0.448 1.00 44.65 S
33.022 36.410 1.00 32.60 S
20.865 25.289 1.00 32.70 S
-3.714 1.057 1.00 29.44 S
-0.745 36.517 1.00 31.45 S
8.184 50.413 1.00 42.48 S
0.783 40.789 1.00 28.34 S
0.293 -0.546 1.00 38.96 S
31.278 3.696 1.00 29.29 S
0.534 26.808 1.00 31.83 S
7.401 26.231 1.00 34.26 S
21.505 44.460 1.00 27.47 S
28.145 15.662 1.00 24.63 S
23.568 44.496 1.00 34.23 S
6.073 44.135 1.00 34.39 S
6.103 14.181 1.00 40.74 S
27.493 24.298 1.00 41.13 S
21.840 42.100 1.00 28.03 S
5.673 49.231 1.00 50.98 S
18.256 10.204 1.00 35.15 S
0.842 37.222 1.00 33.13 S
14.043 29.649 1.00 33.21 S
28.138 7.264 1.00 39.99 S
-1.643 39.289 1.00 39.67 S
29.635 17.427 1.00 26.70 S
6.949 -7.606 1.00 29.50 S
25.954 44.128 1.00 32.00 S
24.073 1.657 1.00 49.49 S
11.902 7.822 0.50 7.93 AC2
13.289 7.378 0.50 7.40 AC2
14.259 8.582 0.50 6.88 AC2
15.705 8.092 0.50 6.81 AC2
14.058 9.490 0.50 4.77 AC2
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PCT/CAOO/00725 



ATOM   3109  CD1 ILE   113      -6.182  14.283   8.794  0.50  2.46      AC2
ATOM   3110  C   ILE   113      -2.460  13.442   6.524  0.50  8.18      AC2
ATOM   3111  O   ILE   113      -1.352  13.144   6.976  0.50  7.65      AC2
END
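
The reflection counts reported in the REMARK headers of Tables 3 and 4 can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original filing; the constant and function names are illustrative assumptions, and the numbers are copied from the REMARK lines.

```python
# Reflection counts copied from the REMARK header of the refinement output.
# All names below are illustrative, not taken from the refinement program.
THEORETICAL = 61022   # theoretical total number of reflections in the range
UNOBSERVED = 18103    # no entry or |F| = 0
REJECTED = 0          # rejected by the sigma and rms cutoffs
WORKING = 40743       # working set
TEST = 2176           # test set


def percent(count: int, total: int = THEORETICAL) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Reflections used = theoretical total minus unobserved and rejected ones.
used = THEORETICAL - UNOBSERVED - REJECTED

# The working and test sets should partition the reflections used.
assert used == WORKING + TEST

for label, count in [("unobserved", UNOBSERVED), ("used", used),
                     ("working set", WORKING), ("test set", TEST)]:
    print(f"{label:12s} {count:6d} ({percent(count):5.1f} %)")
```

Under these assumptions the working and test sets add up to the total number of reflections used, and each percentage is taken relative to the theoretical total for the resolution range.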
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Table 4 

REMARK Model of GnT I with Acceptor. GnT I "be" with experimental UDP-
REMARK GlcNAc and Manganese 2+ ion, with Man5GlcNAc2 acceptor modeled into
REMARK the active site. Ulug Unligil & Dr. James Rini, October 25, 1999.
REMARK coordinates from minimization refinement
REMARK refinement resolution: 50.0 - 1.8 A
REMARK starting r= 0.2113 free_r= 0.2440
REMARK final    r= 0.2103 free_r= 0.2424
REMARK rmsd bonds= 0.005928 rmsd angles= 1.31456
REMARK wa= 1.03895
REMARK target= mlf cycles= 1 steps= 200
REMARK sg= P2(1)2(1)2(1) a= 40.541 b= 82.190 c= 101.956 alpha= 90 beta= 90 gamma= 90
REMARK parameter file 1  : CNS_TOPPAR:protein_rep.param
REMARK parameter file 2  : CNS_TOPPAR:ion.param
REMARK parameter file 3  : ../../../data/udpglcnac.param
REMARK parameter file 4  : CNS_TOPPAR:water_rep.param
REMARK parameter file 5  : CNS_TOPPAR:carbohydrate.param
REMARK molecular structure file: alternate.mtf
REMARK input coordinates: alternate.pdb
REMARK reflection file= ../../../data/gntlbe.cv
REMARK ncs= none
REMARK B-correction resolution: 6.0 - 1.8
REMARK initial B-factor correction applied to fobs :
REMARK   B11=   4.242 B22=   1.045 B33=  -5.287
REMARK   B12=   0.000 B13=   0.000 B23=   0.000
REMARK B-factor correction applied to coordinate array B:   -0.095
REMARK bulk solvent: density level= 0.423009 e/A^3, B-factor= 57.5717 A^2
REMARK reflections with |Fobs|/sigma_F < 0.0 rejected
REMARK reflections with |Fobs| > 10000 * rms(Fobs) rejected
REMARK anomalous diffraction data was input
REMARK theoretical total number of refl. in resol. range: 61022 ( 100.0 % )
REMARK number of unobserved reflections (no entry or |F|=0): 18103 ( 29.7 % )
REMARK number of reflections rejected: 0 ( 0.0 % )
REMARK total number of reflections used: 42919 ( 70.3 % )
REMARK number of reflections in working set: 40743 ( 66.8 % )
REMARK number of reflections in test set: 2176 ( 3.6 % )
CRYST1   40.541   82.190  101.956  90.00  90.00  90.00 P 21 21 21
REMARK FILENAME="minimize.200.pdb"
REMARK DATE:24-Oct-1999 23:28:47 created by user: ulu
REMARK VERSION: 0.9a



ATOM      1  CB  ALA   106     -17.124  -1.055  17.595  1.00 30.32
ATOM      2  C   ALA   106     -16.456  -1.029  15.192  1.00 28.81
ATOM      3  O   ALA   106     -15.342  -1.493  15.418  1.00 29.56
ATOM      4  N   ALA   106     -18.153  -2.641  15.996  1.00 30.51
ATOM      5  CA  ALA   106     -17.606  -1.259  16.162  1.00 29.99
ATOM      6  N   VAL   107     -16.730  -0.309  14.111  1.00 28.51
ATOM      7  CA  VAL   107     -15.706  -0.030  13.115  1.00 26.63
ATOM      8  CB  VAL   107     -16.337   0.167  11.724  1.00 27.28
ATOM      9  CG1 VAL   107     -15.260   0.466  10.703  1.00 26.44
ATOM     10  CG2 VAL   107     -17.110  -1.082  11.329  1.00 27.22
ATOM     11  C   VAL   107     -14.918   1.220  13.496  1.00 25.88
ATOM     12  O   VAL   107     -15.494   2.292  13.686  1.00 26.98
ATOM     13  N   ILE   108     -13.600   1.073  13.602  1.00 22.52
ATOM     14  CA  ILE   108     -12.719   2.180  13.968  1.00 18.62
ATOM     15  CB  ILE   108     -11.829   1.808  15.171  1.00 17.17
ATOM     16  CG2 ILE   108     -10.916   2.981  15.532  1.00 18.17
ATOM     17  CG1 ILE   108     -12.706   1.437  16.369  1.00 17.36
ATOM     18  CD1 ILE   108     -11.919   0.915  17.565  1.00 15.54
ATOM     19  C   ILE   108     -11.819   2.544  12.793  1.00 15.84
ATOM     20  O   ILE   108     -10.843   1.851  12.506  1.00 16.39
ATOM     21  N   PRO   109     -12.138   3.643  12.096  1.00 15.51
ATOM     22  CD  PRO   109     -13.322   4.508  12.264  1.00 13.90
ATOM     23  CA  PRO   109     -11.340   4.077  10.949  1.00 13.38
ATOM     24  CB  PRO   109     -12.283   5.024  10.224  1.00 14.20
ATOM     25  CG  PRO   109     -12.994   5.683  11.366  1.00 14.4o
ATOM     26  C   PRO   109     -10.040   4.764  11.337  1.00 12.65
ATOM     27  O   PRO   109      -9.937   5.401  12.396  1.00  9.93
ATOM     28  N   ILE   110      -9.039   4.603  10.482  1.00 11.19
ATOM     29  CA  ILE   110      -7.762   5.249  10.698  1.00  9.94
ATOM     30  CB  ILE   110      -6.570   4.350  10.316  1.00  9.93
ATOM     31  CG2 ILE   110      -5.259   5.061  10.665  1.00  5.75
ATOM     32  CG1 ILE   110      -6.671   3.002  11.031  1.00  5.91
ATOM     33  CD1 ILE   110      -6.678   3.086  12.554  1.00  9.51
ATOM     34  C   ILE   110      -7.802   6.416   9.729  1.00 10.58
ATOM     35  O   ILE   110      -7.889   6.215   8.514  1.00 lU.oZ
ATOM     36  N   LEU   111      -7.772   7.630  10.264  1.00 o,d4
ATOM     37  CA  LEU   111      -7.791   8.818   9.429  1.00 o.4o
ATOM     38  CB  LEU   111      -8.636   9.922  10.070  1.00 ^.Uo
ATOM     39  CG  LEU   111      -8.517  11.312   9.434  1.00 11.
ATOM     40  CD1 LEU   111      -8.911  11.252   7.965  1.00 /.DO
ATOM     41  CD2 LEU   111      -9.403  12.291  10.183  1.00 /.o4
ATOM     42  C   LEU   111      -6.359   9.295   9.270  1.00
ATOM     43  O   LEU   111      -5.744   9.765  10.226  1.00  6.Jo
ATOM     44  N   VAL   112      -5.827   9.146   8.063  1.00  7.17
ATOM     45  CA  VAL   112      -4.470   9.577   7.764  1.00 O.D4
ATOM     46  CB  VAL   112      -3.787   8.623   6.751  1.00 11.2d
ATOM     47  CG1 VAL   112      -2.413   9.150   6.369  1.00 10.14
ATOM     48  CG2 VAL   112      -3.665   7.229   7.351  1.00  5.UJ
ATOM     49  C   VAL   112      -4.501  10.985   7.173  1.00  8.75
ATOM     50  O   VAL   112      -5.162  11.231   6.159  1.00  8.9o
ATOM     51  N   ILE   113      -3.786  11.904   7.813  0.50 O.60
ATOM     52  CA  ILE   113      -3.717  13.286   7.354  0.50  8.74
ATOM     53  CB  ILE   113      -3.668  14.279   8.545  0.50 9.D3
ATOM     54  CG2 ILE   113      -3.453  15.701   8.034  0.50  9.6d
ATOM     55  CG1 ILE   113      -4.976  14.217   9.342  0.50  9.12
ATOM     56  CD1 ILE   113      -5.239  12.886  10.007  0.50 11.
ATOM     57  C   ILE   113      -2.460  13.444   6.507  0.50
ATOM     58  O   ILE   113      -1.353  13.161   6.967  0.50 o.j9
ATOM     59  N   ALA   114      -2.641  13.887   5.267  1.00  8.94
ATOM     60  CA  ALA   114      -1.526  14.067   4.340  1.00 10.39
ATOM     61  CB  ALA   114      -1.535  12.931   3.301  1.00  9.
ATOM     62  C   ALA   114      -1.590  15.420   3.638  1.00 10.bJ
ATOM     63  O   ALA   114      -2.602  16.114   3.714  1.00  9.U4
ATOM     64  N   CYS   115      -0.510  15.783   2.943  1.00
ATOM     65  CA  CYS   115      -0.450  17.065   2.235  1.00 1J.ZZ
ATOM     66  C   CYS   115       0.483  17.061   1.027  1.00 13.96
ATOM     67  O   CYS   115       0.035  16.900  -0.114  1.00 13.lU
ATOM     68  CB  CYS   115      -0.037  18.167   3.220  1.00 14.lo
ATOM     69  SG  CYS   115       0.498  19.788   2.564  1.00 12.60
ATOM     70  N   ASP   116       1.780  17.224   1.273  1.00 13.40
ATOM     71  CA  ASP   116       2.741  17.270   0.182  1.00 13.43
ATOM     72  CB  ASP   116       3.128  18.729  -0.lU^  1.00 ±o.34
ATOM     73  CG  ASP   116       3.695  19.440   1.118  1.00 14.77
ATOM     74  OD1 ASP   116       3.823  20.682   1.071  1.00 14.65
ATOM     75  OD2 ASP   116       4.018  18.769   2.120  1.00 12.44
ATOM     76  C   ASP   116       4.001  16.439   0.374  1.00 13.44
ATOM     77  O   ASP   116       5.041  16.747  -0.217  1.00 13.25
ATOM     78  N   ARG   117       3.920  15.398   1.198  1.00 10.57
ATOM     79  CA  ARG   117       5.070  14.531   1.420  1.00 10.95
ATOM     80  CB  ARG   117       5.415  14.453   2.916  1.00  9.63
ATOM     81  CG  ARG   117       6.084  15.719   3.450  1.00 11.49
ATOM     82  CD  ARG   117       6.500  15.603   4.922  1.00 11.38
ATOM     83  NE  ARG   117       5.364  15.378   5.808  1.00 13.18
ATOM     84  CZ  ARG   117       5.309  15.785   7.073  1.00 11.51
ATOM     85  NH1 ARG   117       6.332  16.450   7.606  1.00 10.41
ATOM     86  NH2 ARG   117       4.234  15.523   7.806  1.00  8.45
ATOM     87  C   ARG   117       4.776  13.140   0.869  1.00 11.39
ATOM     88  O   ARG   117       3.843  12.462   1.318  1.00 10.86
ATOM     89  N   SER   118       5.572  12.725  -0.112  1.00 11.70
ATOM     90  CA  SER   118       5.405  11.421  -0.738  1.00 12.38
ATOM     91  CB  SER   118       6.299  11.301  -1.977  1.00 11.70
ATOM     92  OG  SER   118       7.661  11.509  -1.644  1.00 15.48
ATOM     93  C   SER   118       5.714  10.291   0.226  1.00 12.48
ATOM     94  O   SER   118       5.332   9.146  -0.018  1.00 13.74
ATOM     95  N   THR   119       6.400  10.615   1.318  1.00 12.22
ATOM     96  CA  THR   119       6.749   9.618   2.319  1.00 11.56
ATOM     97  CB  THR   119       7.799  10.163   3.311  1.00 12.32
ATOM     98  OG1 THR   119       7.359  11.417   3.844  1.00 12.38
ATOM     99  CG2 THR   119       9.135  10.361   2.609  1.00  8.85
ATOM    100  C   THR   119       5.516   9.135   3.083  1.00 13.70
ATOM    101  O   THR   119       5.622   8.380   4.058  1.00 11.52
ATOM    102  N   VAL   120       4.342   9.585   2.649  1.00 11.86
ATOM    103  CA  VAL   120       3.107   9.130   3.265  1.00 11.85
ATOM    104  CB  VAL   120       1.871   9.844   2.658  1.00 11.66
ATOM    105  CG1 VAL   120       1.800   9.587   1.153  1.00 10.50
ATOM    106  CG2 VAL   120       0.597   9.359   3.349  1.00 11.43
ATOM    107  C   VAL   120       3.085   7.632   2.925  1.00 11.51
ATOM    108  O   VAL   120       2.362   6.844   3.535  1.00  9.54
ATOM    109  N   ARG   121       3.908   7.261   1.942  1.00 10.76
ATOM    110  CA  ARG   121       4.044   5.873   1.509  1.00 11.72
ATOM    111  CB  ARG   121       5.079   5.770   0.375  1.00 11.43
ATOM    112  CG  ARG   121       5.338   4.347  -0.129  1.00 14.00
ATOM    113  CD  ARG   121       6.479   4.308  -1.156  1.00 14.06
ATOM    114  NE  ARG   121       6.150   4.990  -2.406  1.00 16.49
ATOM    115  CZ  ARG   121       5.333   4.507  -3.335  1.00 16.27
ATOM    116  NH1 ARG   121       4.751   3.327  -3.162  1.00 16.75
ATOM    117  NH2 ARG   121       5.104   5.200  -4.445  1.00 15.29
ATOM    118  C   ARG   121       4.496   5.015   2.692  1.00 11.85
ATOM    119  O   ARG   121       3.944   3.948   2.944  1.00 10.32
ATOM    120  N   ARG   122       5.499   5.490   3.423  1.00 12.Ih
ATOM    121  CA  ARG   122       6.009   4.752   4.570  1.00 11.26
ATOM    122  CB  ARG   122       7.213   5.481   5.170  1.00 14.35
ATOM    123  CG  ARG   122       7.814   4.824   6.402  1.00 12.80
ATOM    124  CD  ARG   122       9.057   5.583   6.857  1.00 16.52
ATOM    125  NE  ARG   122       9.519   5.143   8.171  1.00 17.45
ATOM    126  CZ  ARG   122      10.639   5.565   8.751  1.00 20.14
ATOM    127  NH1 ARG   122      11.423   6.438   8.132  1.00 19.97
ATOM    128  NH2 ARG   122      10.969   5.125   9.960  1.00 17.32
ATOM    129  C   ARG   122       4.922   4.574   5.619  1.00 11.63
ATOM    130  O   ARG   122       4.805   3.508   6.228  1.00  8.67
ATOM    131  N   CYS   123       4.129   5.622   5.828  1.00 11.27
ATOM    132  CA  CYS   123       3.042   5.578   6.794  1.00 10.75
ATOM    133  CB  CYS   123       2.362   6.951   6.884  1.00 11.44
ATOM    134  SG  CYS   123       0.888   7.025   7.946  1.00  9.36
ATOM    135  C   CYS   123       2.027   4.521   6.372  1.00 11.83
ATOM    136  O   CYS   123       1.686   3.631   7.152  1.00 10.86
ATOM    137  N   LEU   124       1.565   4.610   5.127  1.00 11.28
ATOM    138  CA  LEU   124       0.576   3.671   4.621  1.00 10.13
ATOM    139  CB  LEU   124       0.073   4.110   3.236  1.00 10.53
ATOM    140  CG  LEU   124      -0.782   5.388   3.163  1.00  9.60
ATOM    141  CD1 LEU   124      -1.222   5.625   1.724  1.00 11.19
ATOM    142  CD2 LEU   124      -2.006   5.250   4.059  1.00  8.81
ATOM    143  C   LEU   124       1.077   2.229   4.563  1.00 10.25
ATOM    144  O   LEU   124       0.335   1.309   4.903  1.00 10.08
ATOM    145  N   ASP   125       2.324   2.021   4.141  1.00  9.94
ATOM    146  CA  ASP   125       2.847   0.660   4.068  1.00 12.02
ATOM    147  CB  ASP   125       4.315   0.642   3.620  1.00 12.22
ATOM    148  CG  ASP   125       4.488   0.985   2.144  1.00 12.03
ATOM    149  OD1 ASP   125       3.515   0.869   1.363  1.00 10.10
ATOM    150  OD2 ASP   125       5.614   1.358   1.759  1.00 10.21
ATOM    151  C   ASP   125       2.716  -0.062   5.408  1.00 12.46
ATOM    152  O   ASP   125       2.157  -1.157   5.471  1.00 11.05
ATOM    153  N   LYS   126       3.207   0.560   6.478  1.00 13.22
ATOM    154  CA  LYS   126       3.148  -0.043   7.808  1.00 14.05
ATOM    155  CB  LYS   126       3.939   0.806   8.806  1.00 16.16
ATOM    156  CG  LYS   126       5.447   0.756   8.608  1.00 17.78
ATOM    157  CD  LYS   126       5.973  -0.667   8.774  1.00 18.40
ATOM    158  CE  LYS   126       7.488  -0.728   8.639  1.00 20.47
ATOM    159  NZ  LYS   126       7.967  -0.256   7.307  1.00 19.63
ATOM    160  C   LYS   126       1.730  -0.264   8.336  1.00 13.27
ATOM    161  O   LYS   126       1.445  -1.299   8.939  1.00 14.34
ATOM    162  N   LEU   127       0.843   0.704   8.115  1.00 12.29
ATOM    163  CA  LEU   127      -0.540   0.583   8.573  1.00 12.92
ATOM    164  CB  LEU   127      -1.329   1.854   8.249  1.00 12.24
ATOM    165  CG  LEU   127      -1.200   3.065   9.169  1.00 11.93
ATOM    166  CD1 LEU   127      -1.827   4.282   8.494  1.00 11.59
ATOM    167  CD2 LEU   127      -1.887   2.779  10.493  1.00  7.66
ATOM    168  C   LEU   127      -1.231  -0.598   7.905  1.00 13.38
ATOM    169  O   LEU   127      -1.896  -1.396   8.561  1.00 14.13
ATOM    170  N   LEU   128      -1.076  -0.689   6.590  1.00 13.28
ATOM    171  CA  LEU   128      -1.686  -1.753   5.813  1.00 12.88
ATOM    172  CB  LEU   128      -1.574  -1.433   4.318  1.00 12.83
ATOM    173  CG  LEU   128      -2.528  -0.348   3.819  1.00 13.33
ATOM    174  CD1 LEU   128      -2.152   0.091   2.400  1.00 10.51
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-15.690 14.514 21.633 

-14.262 13.995 21.829 

-13.221 15.117 21.961 

-13.556 15.990 23.172 

-11.820 14.525 22.098

-15.700 15.505 20.479 

-15.669 16.720 20.693 

-15.733 14.992 19.255 

-15.753 15.868 18.097 
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-18.213 21.835 18.188 
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-19.575 22.450 18.380 

-17.254 25.266 17.757 

-17.489 26.416 17.384

-16.187 24.951 18.482

-15.236 25.970 18.901

-15.194 26.034 20.436 
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1,00 12.41 
1.00 13.86 
1.00 14.33 
1.00 13.74 
1.00 12.16
1.00 11.89
1.00 13.50 
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1.00 12.62 
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1.00 9.94 
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1,00 15.62 
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1,00 15-44 
1.00 13.86 
1.00 12.81 
1.00 12.81 
1.00 13.98
1.00 15.70 
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1.00 13.55 
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1.00 12.13 
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1.00 11.64 
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1.00 15.55 
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1.00 19,89 
1.00 23.75
1.00 19.62 
1.00 15.53 
1.00 16.03 
1.00 15.91 
1.00 16.18 
1.00 14.91 
1.00 16.00 
1.00 15.67 
1.00 15.99
1.00 16.84
1.00 13.30 
1.00 12.81 
1.00 11.27 
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284 -15.042 24.672 

284 -14.832 24.791

284 -14.664 23.409 

284 -14.221 23.487 

284 -13.833 25.741 

284 -12.865 26.302

285 -13.723 24.925 
285 -12.424 24.629
285 -12.614 23.823
285 -11.364 23.132
285 -10.716 22.011
285 -9.595 21.687
285 -10.975 21.248
285 -10.627 23.437
285 -9.564 22.571
285 -8.731 20.631
285 -10.112 20.193 
285 -9.005 19.897 
285 -11.654 25.928 

285 -12.174 26.843 

286 -10.392 26.010 
286 -9.650 24.903 
286 -9.522 27.179 
286 -8.393 26.901 
286 -8.212 25.420 
286 -9.004 27.395 

286 -9.080 26.500 

287 -8.464 28.586 
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287 -9.185 31.165 
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287 -12.569 31.232 
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289 -1.604 26.394 
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289 -0,561 29-713 
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289 -3.356 24,603 
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290 -4.034 23.707 
290 -5.119 22.923 
290 -5.784 22.059 
290 -4.928 20.947 
290 -4.814 19.646 
290 -3.807 18.957 
290 -5.458 19.000 
290 -4.024 20.990 
290 -3.343 19.797 
290 -3.426 17.647
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13.702 1.00 11.53 
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290 -5.080 17.700 
14.955  19.256  1.00 12.67
ATOM   1856  C   LEU   331      11.436  12.597  18.090  1.00 11.92
ATOM   1857  O   LEU   331      10.893  11.492  18.111  1.00 10.11
ATOM   1858  N   LYS   332      12.596  12.821  17.476  1.00 11.07
ATOM   1859  CA  LYS   332      13.279  11.751  16.754  1.00 10.90
ATOM   1860  CB  LYS   332      14.458  12.312  15.938  1.00 13.83
ATOM   1861  CG  LYS   332      15.640  12.810  16.786  1.00 16.82
ATOM   1862  CD  LYS   332      16.830  13.261  15.926  1.00 17.91
ATOM   1863  CE  LYS   332      17.569  12.075  15.323  1.00 20.07
ATOM   1864  NZ  LYS   332      18.804  12.488  14.583  1.00 20.10
ATOM   1865  C   LYS   332      13.791  10.637  17.660  1.00 10.47
ATOM   1866  O   LYS   332      14.102   9.549  17.186  1.00 10.93
ATOM   1867  N   PHE   333      13.874  10.895  18.959  1.00 11.12
ATOM   1868  CA  PHE   333      14.391   9.882  19.877  1.00 11.48
ATOM   1869  CB  PHE   333      15.254  10.549  20.951  1.00 13.10
ATOM   1870  CG  PHE   333      16.436  11.281  20.387  1.00 13.67
ATOM   1871  CD1 PHE   333      16.512  12.668  20.454  1.00 12.34
ATOM   1872  CD2 PHE   333      17.447  10.583  19.729  1.00 14.41
ATOM   1873  CE1 PHE   333      17.575  13.355  19.868  1.00 12.53
ATOM   1874  CE2 PHE   333      18.514  11.257  19.140  1.00 14.58
ATOM   1875  CZ  PHE   333      18.576  12.648  19.209  1.00 12.27
ATOM   1876  C   PHE   333      13.343   8.989  20.530  1.00 11.51
ATOM   1877  O   PHE   333      13.691   8.060  21.259  1.00  7.89
ATOM   1878  N   ILE   334      12.069   9.259  20.260  1.00 10.59
ATOM   1879  CA  ILE   334      10.995   8.454  20.828  1.00 14.68
ATOM   1880  CB  ILE   334       9.630   9.159  20.653  1.00 14.72
ATOM   1881  CG2 ILE   334       8.517   8.326  21.255  1.00 15.09
ATOM   1882  CG1 ILE   334       9.679  10.533  21.326  1.00 18.02
ATOM   1883  CD1 ILE   334      10.025  10.496  22.807  1.00 19.85
ATOM   1884  C   ILE   334      10.994   7.091  20.140  1.00 14.78
ATOM   1885  O   ILE   334      10.777   6.989  18.931  1.00 16.14
ATOM   1886  N   LYS   335      11.247   6.051  20.928  1.00 16.41
ATOM   1887  CA  LYS   335      11.326   4.673  20.447  1.00 17.50
ATOM   1888  CB  LYS   335      12.033   3.823  21.510  1.00 21.51
ATOM   1889  CG  LYS   335      12.071   2.326  21.234  1.00 26.07
ATOM   1890  CD  LYS   335      12.709   1.587  22.406  1.00 29.71
ATOM   1891  CE  LYS   335      12.623   0.076  22.242  1.00 32.06
ATOM   1892  NZ  LYS   335      13.329  -0.404  21.023  1.00 32.54
ATOM   1893  C   LYS   335       9.999   4.012  20.069  1.00 16.08
ATOM   1894  O   LYS   335       9.001   4.125  20.783  1.00 14.22
ATOM   1895  N   LEU   336       9.999   3.307  18.942  1.00 15.53
ATOM   1896  CA  LEU   336       8.802   2.605  18.485  1.00 13.84
ATOM   1897  CB  LEU   336       8.781   2.516  16.956  1.00 15.33
ATOM   1898  CG  LEU   336       7.555   1.837  16.328  1.00 15.30
ATOM   1899  CD1 LEU   336       6.310   2.650  16.650  1.00 12.28
ATOM   1900  CD2 LEU   336       7.732   1.728  14.813  1.00 11.38
ATOM   1901  C   LEU   336       8.782   1.198  19.063  1.00 14.42
ATOM   1902  O   LEU   336       9.775   0.469  18.972  1.00 15.70
ATOM   1903  N   ASN   337       7.657   0.810  19.656  1.00 13.30
ATOM   1904  CA  ASN   337       7.535  -0.527  20.217  1.00 13.41
ATOM   1905  CB  ASN   337       6.182  -0.699  20.912  1.00 12.12
ATOM   1906  CG  ASN   337       6.020  -2.074  21.519  1.00 11.55
ATOM   1907  OD1 ASN   337       6.794  -2.473  22.389  1.00 13.55
ATOM   1908  ND2 ASN   337       5.017  -2.810  21.060  1.00  8.19
ATOM   1909  C   ASN   337       7.676  -1.578  19.119  1.00 15.01
ATOM   1910  O   ASN   337       7.140  -1.417  18.018  1.00 13.86
ATOM   1911  N   GLN   338       8.396  -2.653  19.427  1.00 15.85
ATOM   1912  CA  GLN   338       8.619  -3.730  18.470  1.00 18.25
ATOM   1913  CB  GLN   338      10.116  -3.882  18.188  1.00 21.65
ATOM   1914  CG  GLN   338      10.746  -2.660  17.533  1.00 27.69
ATOM   1915  CD  GLN   338      10.158  -2.358  16.165  1.00 30.34
ATOM   1916  OE1 GLN   338      10.491  -1.349  15.543  1.00 34.83
ATOM   1917  NE2 GLN   338       9.281  -3.236  15.689  1.00 31.44
ATOM   1918  C   GLN   338       8.058  -5.069  18.939  1.00 17.53
ATOM   1919  O   GLN   338       7.865  -5.975  18.135  1.00 17.97
ATOM   1920  N   GLN   339       7.806  -5.201  20.237  1.00 15.76
ATOM   1921  CA  GLN   339       7.268  -6.451  20.764  1.00 14.91
ATOM   1922  CB  GLN   339       7.836  -6.751  22.157  1.00 17.40
ATOM   1923  CG  GLN   339       7.315  -8.061  22.742  1.00 23.41
ATOM   1924  CD  GLN   339       7.984  -8.449  24.050  1.00 26.89
ATOM   1925  OE1 GLN   339       7.642  -9.472  24.652  1.00 29.87
ATOM   1926  NE2 GLN   339       8.942  -7.641  24.496  1.00 26.27
ATOM   1927  C   GLN   339       5.750  -6.373  20.829  1.00 12.70
ATOM   1928  O   GLN   339       5.189  -5.524  21.527  1.00 10.10
ATOM   1929  N   PHE   340       5.094  -7.261  20.089  1.00 11.33
ATOM   1930  CA  PHE   340       3.640  -7.296  20.035  1.00 11.93
ATOM   1931  CB  PHE   340       3.164  -8.272  18.950  1.00  9.13
ATOM   1932  CG  PHE   340       1.677  -8.244  18.726  1.00  9.16
ATOM   1933  CD1 PHE   340       1.127  -7.456  17.717  1.00 10.09
ATOM   1934  CD2 PHE   340       0.823  -8.960  19.556  1.00 11.24
ATOM   1935  CE1 PHE   340      -0.250  -7.377  17.542  1.00  6.38
ATOM   1936  CE2 PHE   340      -0.558  -8.888  19.391  1.00 11.09
ATOM   1937  CZ  PHE   340      -1.095  -8.095  18.382  1.00 12.06
ATOM   1938  C   PHE   340       3.027  -7.707  21.366  1.00 10.98
ATOM   1939  O   PHE   340       3.334  -8.771  21.899  1.00 10.18
ATOM   1940  N   VAL   341       2.153  -6.856  21.891  1.00  9.99
ATOM   1941  CA  VAL   341       1.465  -7.131  23.144  1.00  8.08
ATOM   1942  CB  VAL   341       1.563  -5.929  24.113  1.00  8.45
ATOM   1943  CG1 VAL   341       0.740  -6.202  25.367  1.00  9.42
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T\ TOM 


1944 


CG2 


VAL 


341 


3.020 


-5.667 


24.481 


1.00 


6.53 


n*rnM 


1945 


c 


VAL 


341 


-0,012 


-7.383 


22.817 


1.00 


9.63 


UTOM 

WW 


1946 


o 


VAL 


341 


-0.657 


-6.559 


22.162 


1.00 


8.38 


2XTOM 


1947 


N 


PRO 


342 


-0.559 


-8.534 


23.243 


1.00 


8.89 


a TOM 


1948 


CD 


PRO 


342 


0.084 


-9.668 


23.936 


1,00 


8.68 


ATOM 


1949 
J. ^ t ^ 


CA 


PRO 


342 


-1-969 


-8.825 


22.962 


1,00 


9,24 


ATOM 


1950 


CB 


PRO 


342 


-2.053 


-10.337 


23.159 


1,00 


7.94 


ATOM 


1951 


CG 


PRO 


342 


-1.115 


-10.560 


24.302 


1.00 


6.57 


AT AM 


1952 


c 


PRO 


342 


-2.877 


-8.065 


23,934 


1.00 


7.98 


2VTOM 


1953 


o 


PRO 


342 


-3,531 


-8.671 


24,784 


1.00 


10.70 


i-v X \Jiri 


1954 


N 


PHE 


343 


-2.910 


-6.741 


23.804 


1,00 


8.97 


n X yjct 


1955 


CA 


PHE 


343 


-3.732 


-5-908 


24.679 


1.00 


7.44 


nX wii 


1956 


CB 


PHE 


343 


-3.730 


-4.445 


24,220 


1.00 


7.91 


ATOM 


1957 


CG 


PHE 


343 


-2.430 


-3.722 


24.464 


1.00 


8.27 


ATOM 


1958 


CDl 


PHE 


343 


-1.543 


-3.489 


23.420 


1.00 


8.46 


ATOM 
/>X wli 


1959 


CD2 


PHE 


343 


-2.116 


-3.239 


25.729 


1.00 


8.30 


ATOM 


I960 


CEl 


PHE 


343 


-0.363 


-2.779 


23.631 


1.00 


10.11 


ATOM 


1961 


CE2 


PHE 


343 


-0.939 


-2.530 


25.952 


1,00 


10.97 


ATOM 


1962 


cz 


PHE 


343 


-0.061 


-2.297 


24.902 


1.00 


8.08 


ATOM 


1963 


c 


PHE 


343 


-5.173 


-6,373 


24.797 


1.00 


8,71 


ATOM 


1964 


o 


PHE 


343 


-5.769 


-6.238 


25.866 


1.00 


7.12 


ATOM   1965 N    THR   344      -5.743  -6.909  23.718  1.00  7.51
ATOM   1966 CA   THR   344      -7.131  -7.376  23.779  1.00  9.69
ATOM   1967 CB   THR   344      -7.697  -7.748  22.374  1.00  9.08
ATOM   1968 OG1  THR   344      -6.945  -8.829  21.809  1.00 13.42
ATOM   1969 CG2  THR   344      -7.643  -6.543  21.438  1.00 10.33
ATOM   1970 C    THR   344      -7.314  -8.573  24.720  1.00 11.50
ATOM   1971 O    THR   344      -8.440  -8.898  25.101  1.00 11.32
ATOM   1972 N    GLN   345      -6.215  -9.215  25.113  1.00 11.97
ATOM   1973 CA   GLN   345      -6.290 -10.365  26.021  1.00 10.36
ATOM   1974 CB   GLN   345      -5.310 -11.463  25.594  1.00 11.78
ATOM   1975 CG   GLN   345      -5.523 -12.052  24.195  1.00 11.91
ATOM   1976 CD   GLN   345      -4.483 -13.116  23.878  1.00 14.76
ATOM   1977 OE1  GLN   345      -3.432 -13.163  24.514  1.00 13.98
ATOM   1978 NE2  GLN   345      -4.764 -13.964  22.889  1.00 13.27
ATOM   1979 C    GLN   345      -5.954  -9.960  27.457  1.00 12.23
ATOM   1980 O    GLN   345      -5.921 -10.795  28.359  1.00 11.22


ATOM   1981 N    LEU   346      -5.719  -8.673  27.666  1.00 11.87
ATOM   1982 CA   LEU   346      -5.349  -8.161  28.980  1.00 12.41
ATOM   1983 CB   LEU   346      -4.174  -7.205  28.808  1.00 12.68
ATOM   1984 CG   LEU   346      -3.042  -7.819  27.986  1.00 13.31
ATOM   1985 CD1  LEU   346      -1.928  -6.812  27.778  1.00 12.89
ATOM   1986 CD2  LEU   346      -2.525  -9.044  28.702  1.00 11.46
ATOM   1987 C    LEU   346      -6.469  -7.457  29.746  1.00 12.48
ATOM   1988 O    LEU   346      -7.440  -6.994  29.156  1.00 13.97
ATOM   1989 N    ASP   347      -6.310  -7.364  31.065  1.00 11.50
ATOM   1990 CA   ASP   347      -7.294  -6.714  31.929  1.00 11.90
ATOM   1991 CB   ASP   347      -7.346  -7.390  33.304  1.00 12.47
ATOM   1992 CG   ASP   347      -8.432  -6.819  34.194  1.00 17.32
ATOM   1993 OD1  ASP   347      -8.858  -5.668  33.949  1.00 15.32
ATOM   1994 OD2  ASP   347      -8.853  -7.522  35.145  1.00 17.61
ATOM   1995 C    ASP   347      -6.863  -5.271  32.118  1.00 12.21
ATOM   1996 O    ASP   347      -6.032  -4.983  32.979  1.00 10.38
ATOM   1997 N    LEU   348      -7.421  -4.366  31.322  1.00 11.56
ATOM   1998 CA   LEU   348      -7.064  -2.957  31.416  1.00 13.74
ATOM   1999 CB   LEU   348      -7.173  -2.293  30.035  1.00 14.79
ATOM   2000 CG   LEU   348      -6.576  -3.023  28.826  1.00 13.77
ATOM   2001 CD1  LEU   348      -6.741  -2.172  27.571  1.00 14.06
ATOM   2002 CD2  LEU   348      -5.119  -3.303  29.060  1.00 13.73
ATOM   2003 C    LEU   348      -7.948  -2.187  32.400  1.00 15.03
ATOM   2004 O    LEU   348      -8.051  -0.967  32.305  1.00 12.96
ATOM   2005 N    SER   349      -8.578  -2.881  33.344  1.00 16.06
ATOM   2006 CA   SER   349      -9.456  -2.193
ATOM   2007 CB   SER   349     -10.304  -3.201
ATOM   2008 OG   SER   349      -9.506  -4.003
ATOM   2009 C    SER   349      -8.710  -1.287
ATOM   2010 O    SER   349      -9.315  -0.435
ATOM   2011 N    TYR   350      -7.400  -1.466
ATOM   2012 CA   TYR   350      -6.605  -0.645
ATOM   2013 CB   TYR   350      -5.209  -1.257
ATOM   2014 CG   TYR   350      -4.331  -1.216
ATOM   2015 CD1  TYR   350      -3.429  -0.168
ATOM   2016 CE1  TYR   350      -2.608  -0.137
ATOM   2017 CD2  TYR   350      -4.393  -2.228
ATOM   2018 CE2  TYR   350      -3.583  -2.204
ATOM   2019 CZ   TYR   350
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,762 
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ATOM 


2750 
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-7. 


168 


20. 


232 


44 . 


494 


1. 


00 


29. 


14 


ATOM 


2751 
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871 
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116 


42. 


657 
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00 


20. 


57 


ATOM 


2752 


0 
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-5. 
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119 
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699 
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00 


21. 


05 
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2753 


N 


GLY 
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636 


41. 


519 
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00 


17. 


58 
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2754 
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690 


21. 
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16. 


34 
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2755 


C 
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512 
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614 
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00 
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68 
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2756 
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587 
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40. 


294 
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00 


16. 


80 
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2757 


N 
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22. 


113 
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1. 


00 


15. 


98 
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471 
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753 
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555 


1. 


00 


14. 


71 


ATOM 


2759 


CB 
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471 
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102 


1. 


00 


13. 


29 
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2760 


CG 
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20. 


814 


35. 


947 
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00 


11. 


43 


ATOM 


2761 
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-3. 


146 


19. 
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35. 
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00 


10. 


71 
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58 
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013 
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14 
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879 


1- 


00 


9. 


35 
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509 
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1- 


00 
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77 


ATOM 
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16. 
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563 
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00 


9. 


42 
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585 
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48 
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567 


24. 
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1. 
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17. 


32 
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24. 
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1. 


00 


14. 


75 
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26. 
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316 


1. 


00 
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10 
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26. 
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1. 


00 


15. 


62 
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2772 
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296 


28. 
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1. 


00 


16. 


83 


ATOM 
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654 
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027 


37. 


666 


1. 


00 


18. 


29 


ATOM 


2774 
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0. 


045 


28. 


747 


39. 


726 


1. 


00 


19. 


92 
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-0. 


736 


26. 


762 


35. 


945 


1. 


00 


14. 


68 


ATOM 


2776 


O 


ASP 
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0. 


447 


26. 


607 


35. 


644 


1. 


00 


17. 


52 


ATOM   2777 N    PRO   444      -1.618  27.301  35.092  1.00 14.77
ATOM   2778 CD   PRO   444      -3.036  27.626  35.324  1.00 15.73
ATOM   2779 CA   PRO   444      -1.195  27.721  33.751  1.00 14.22
ATOM   2780 CB   PRO   444      -2.496  28.224  33.122  1.00 15.85
ATOM   2781 CG   PRO   444      -3.281  28.715  34.309  1.00 17.54
ATOM   2782 C    PRO   444      -0.074  28.757  33.731  1.00 13.23
ATOM   2783 O    PRO   444       0.558  28.970  32.698  1.00 14.12
ATOM   2784 N    SER   445       0.189  29.387  34.872  1.00 13.02
ATOM   2785 CA   SER   445       1.248  30.390  34.942  1.00 14.12
ATOM   2786 CB   SER   445       1.094  31.267  36.194  1.00 15.43
ATOM   2787 OG   SER   445       1.249  30.515  37.382  1.00 14.51
ATOM   2788 C    SER   445       2.617  29.718  34.935  1.00 15.13
ATOM   2789 O    SER   445       3.646  30.386  34.811  1.00 15.81
ATOM   2790 N    TRP   446       2.623  28.396  35.080  1.00 15.62
ATOM   2791 CA   TRP   446       3.867  27.628  35.045  1.00 16.04
ATOM   2792 CB   TRP   446       3.671  26.239  35.667  1.00 15.23
ATOM   2793 CG   TRP   446       3.790  26.180  37.179  1.00 15.78
ATOM   2794 CD2  TRP   446       4.492  25.187  37.948  1.00 14.80
ATOM   2795 CE2  TRP   446       4.303  25.502  39.313  1.00 15.90
ATOM   2796 CE3  TRP   446       5.261  24.064  37.613  1.00 13.03
ATOM   2797 CD1  TRP   446       3.221  27.032  38.088  1.00 18.10


ATOM 


2798 
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TRP 


446 


3. 


,526 


26. 


.630 


39. 


.371 


1, 


.00 


17, 


.03 


ATOM   2799 CZ2  TRP   446       4.856  24.732  40.345  1.00 16.70
ATOM   2800 CZ3  TRP   446       5.811  23.298  38.638  1.00 14.05


ATOM 


2801 


CH2 
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5. 


.604 


23. 


.637 


39, 


.989 


1. 


.00 


14. 


.48 
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2802 


C 
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4. 


.175 


27. 


.486 


33, 


.561 


1. 


.00 


15, 


.56 
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2803 


o 
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3. 


.544 


26, 


.692 


32, 


.870 


1 


.00 


15. 


.44 
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2804 


N 


THR 


447 


5< 


.128 


28. 


.269 


33 


.066 


1, 


.00 


16. 


.52 
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CA 


THR 
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.480 


28 


.219 


31 


.651 
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.00 


17, 


.33 
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2806 


CB 
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5. 


.159 


29 


.557 


30 


.948 


1 


.00 


16, 


.01 


ATOM 
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THR 
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5. 


.880 


30 


.621 


31 
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.00 
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.51 


ATOM 
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THR 
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3. 


.662 


29 


.856 


31 


.025 
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.00 


17. 


.76 
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THR 
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6 


.951 


27 


.895 


31 


.427 
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.00 


17. 


.48 


ATOM 
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O 


THR 
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7. 


.684 


27 


.735 


32 


.422 


1 


.00 


16. 


.49 


ATOM 
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0T2 


THR 


447 


7 


.350 


27 


.808 


30 


.248 


1 


.00 


20, 


.48 



WO 00/78936






ATOM 
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-2. 
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475 
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-0. 
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14. 


470 
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667 
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S 
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960 
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8. 


342 
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2. 
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305 
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810 
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9. 
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7. 


228 
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2820 
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S 


9 


-1. 


762 


-4. 


348 
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10 


1, 
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9- 


499 
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2822 
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TIP 


s 
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1. 


765 
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754 
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2823 
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s 


12 


-16. 


811 


19. 


244 


ATOM   2824 OH2  TIP S  13       3.100  38.468
ATOM   2825 OH2  TIP S  14       4.747  35.913
ATOM   2826 OH2  TIP S  16       7.477  18.698
ATOM   2827 OH2  TIP S  17      -1.635  -3.163
ATOM   2828 OH2  TIP S  18       8.703   4.872
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2829 
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655 
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-6. 


026 
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471 
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9. 


813 


-5. 


306 
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7- 
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25. 
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-6. 


847 
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856 
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2834 
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24 


5. 


387 


11. 


634 


ATOM 


2835 


OH2 


TIP 
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25 


-3. 


336 


18. 


278 


ATOM 


2836 


OH2 


TIP 


s 


26 


3. 


57 6 


28. 


559 


ATOM 


2837 


OH2 
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27 


-4. 


485 


23. 


477 
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s 


28 


-0. 


469 


15. 


386 
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2839 
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8. 


632 
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162 


ATOM 


2840 


OH2 


TIP 


s 


30 


-5. 
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306 
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9. 
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9. 
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7. 
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17. 
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37 


18. 
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485 
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2847 
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s 


39 


8. 
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-0. 
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2848 
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40 
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13. 


733 
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2849 
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s 


41 


3- 


595 


14. 


951 
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2850 
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TIP 


s 


42 


14. 
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-1. 


569 
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s 
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2. 


428 


16. 


587 
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2852 
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2. 
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4. 
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2. 
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26. 
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3. 
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4. 
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-9. 
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26. 


333 
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s 


49 


16. 
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3, 
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s 
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-1. 
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2858 
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s 


51 


-6. 


295 


-4. 


666 


ATOM 


2859 
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TIP 


s 


52 


3. 


718 


17. 


317 


ATOM 


2860 


OH2 


TIP 


s 


53 


-4. 


530 


1. 


420 
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2861 
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TIP 


s 


55 


5- 
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-7. 


098 


ATOM 


2862 
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TIP 


s 


56 


12. 


858 


14. 


464 


ATOM   2863 OH2  TIP S  57      -6.993  13.007
ATOM   2864 OH2  TIP S  59       1.946  -2.718
ATOM   2865 OH2  TIP S  61     -13.046  10.832
ATOM   2866 OH2  TIP S  62      -9.009  30.421
ATOM   2867 OH2  TIP S  63      -5.368  14.834
ATOM   2868 OH2  TIP S  64      15.902   7.116
ATOM   2869 OH2  TIP S  65     -12.026  28.356
ATOM   2870 OH2  TIP S  66     -21.274  19.841
ATOM   2871 OH2  TIP S  67       2.502   9.579


ATOM 


2872 


OH2 
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s 


71 


17. 
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6, 


.826 


ATOM 


2873 


OH2 


TIP 


s 


73 


-10. 


,419 


6. 
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15.320 1.00 6.09 S 

42.726 1.00 8.02 S 

30.895 1.00 12.21 S 

13.894 1.00 11.59 S 
35.349 1.00 11.45 S 
22.128 1.00 10.09 S 
25.262 1.00 10.29 S

29.895 1.00 8.92 S
15.634 1.00 4.51 S

31.865 1.00 11.62 S 
2.497 1.00 6.99 S 
1.628 1.00 12.23 S 

12.074 1.00 8.94 S

12.901 1.00 10.38 S 

30.699 1.00 9.51 S 

30.255 1.00 8.73 S 

23.179 1.00 11.50 S 

14.762 1.00 8.17 S 

4.905 1.00 9.25 S 

24.495 1.00 14.00 S

29.067 1.00 14.85 S 

13.669 1.00 12.68 S 

5.366 1.00 10.13 S

28.003 1.00 11.28 S

8.666 1.00 16.37 S 

32.752 1.00 11.50 S 

17.048 1.00 8.13 S 

-8.613 1.00 34.24 S 

39.866 1.00 9.20 S
28.936 1.00 14.92 S
12.006 1.00 17.95 S
27.132 1.00 10.60 S
17.048 1.00 18.33 S
-4.527 1.00 9.34 S
49.195 1.00 23.24 S 
35.357 1.00 19.31 S 
15.879 1.00 8.85 S 
12.766 1.00 13.64 S 
29.631 1.00 17.35 S

4.597 1.00 8.45 S

9.665 1.00 9.99 S

30.638 1.00 18.05 S 

-6.137 1.00 17.93 S 

11.553 1.00 15.39 S 
31.969 1.00 15.56 S 
12.842 1.00 15.27 S
35.782 1.00 11.85 S 
-7.652 1.00 13.51 S 
38.605 1.00 16.40 S 

35.554 1.00 10.92 S 
46.386 1.00 16.92 S 
38.760 1.00 21.66 S 

3.046 1.00 10.92 S 

22.809 1.00 15.93 S 

17.194 1.00 14.15 S 

28.617 1.00 15.13 S 

22.062 1.00 15.93 S

21.179 1.00 21.81 S 

17.262 1.00 17.46 S 

35.754 1.00 11.78 S 

25.546 1.00 24.21 S 

-3.028 1.00 18.73 S 



WO 00/78936    PCT/CA00/00725



ATOM 


2874 


OH2 


TIP 


S 


76 
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1.542 
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s 
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s 
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-19.029 
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s 


84 


3.507 


ATOM 


2880 


OH2 


TIP 


3 


85 


-10.490 


ATOM 


2881 


OH2 


TIP 


S 


86 
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ATC»« 


2882 


OH2 


TIP 


s 
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-12.608 


ATOM 
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s 
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5.224 


ATOM 
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s 


89 


6.791 
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s 
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-16.107 


ATOM 


2886 
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s 


91 


6.958 


ATOM 
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TIP 


s 


93 


3.503 


ATOM   2888 OH2  TIP S  94       3.974
ATOM   2889 OH2  TIP S  95       2.755
ATOM   2890 OH2  TIP S  96      11.034
ATOM   2891 OH2  TIP S  97       0.170
ATOM   2892 OH2  TIP S  98     -23.158
ATOM   2893 OH2  TIP S 100      -3.421
ATOM   2894 OH2  TIP S 102       6.509
ATOM   2895 OH2  TIP S 105      -2.000
ATOM   2896 OH2  TIP S 106     -12.623
ATOM   2897 OH2  TIP S 107       3.382
ATOM   2898 OH2  TIP S 108      -7.135
ATOM   2899 OH2  TIP S 112      14.466
ATOM   2900 OH2  TIP S 113       4.501
ATOM   2901 OH2  TIP S 114     -18.595
ATOM   2902 OH2  TIP S 115       2.021
ATOM   2903 OH2  TIP S 117       9.096
ATOM   2904 OH2  TIP S 118      -9.458
ATOM   2905 OH2  TIP S 119       0.906
ATOM   2906 OH2  TIP S 122       9.275
ATOM   2907 OH2  TIP S 123      15.065
ATOM   2908 OH2  TIP S 124     -12.335
ATOM   2909 OH2  TIP S 125      -5.724
ATOM   2910 OH2  TIP S 126      12.084
ATOM   2911 OH2  TIP S 128      -8.595
ATOM   2912 OH2  TIP S 129       5.131


ATOM 


2913 
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TIP 


s 


132 


-7.436 


ATOM 


2914 


OH2 


TIP 


s 


133 


-8.171 
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113 


-4. 


858 


14. 


050 


9. 


,473 


0. 


.50 


4 


.62 


ATOM   3172 CD1  ILE   113      -6.196  14.256   8.790  0.50  2.31
ATOM   3173 C    ILE   113      -2.458  13.445   6.508  0.50  8.03
ATOM   3174 O    ILE   113      -1.350  13.164   6.965  0.50  7.50



A 
A 
A 
A 
A 



AC2 
AC2 
AC2 
AC2 
AC2 
AC2 
AC2 
AC2 



END 
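
The coordinate listing above uses PDB-style ATOM records: serial number, atom name, residue name, residue number, x, y, z, occupancy and B value, with an extra one-letter label before the residue number in the water (TIP) records. A minimal sketch of how such a whitespace-separated record could be read is given below. The names `AtomRecord` and `parse_atom_line` are illustrative and not part of the application, and the field order is an assumption taken from the records as printed here.

```python
from typing import NamedTuple, Optional


class AtomRecord(NamedTuple):
    serial: int
    name: str
    res_name: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def parse_atom_line(line: str) -> Optional[AtomRecord]:
    """Parse one whitespace-separated ATOM record; return None if unusable."""
    fields = line.split()
    if len(fields) < 4 or fields[0] != "ATOM":
        return None
    rest = fields[4:]
    # Water records carry a one-letter label (e.g. "S") before the
    # residue number; skip it when present.
    if rest and not rest[0].lstrip("-").isdigit():
        rest = rest[1:]
    if len(rest) < 6:
        return None  # truncated record: coordinates or B value missing
    try:
        serial = int(fields[1])
        res_seq = int(rest[0])
        x, y, z, occupancy, b_factor = (float(v) for v in rest[1:6])
    except ValueError:
        return None  # damaged numeric field such as "1,00" or "-1-969"
    return AtomRecord(serial, fields[2], fields[3], res_seq,
                      x, y, z, occupancy, b_factor)


# Record 1932 as it appears in the listing.
record = parse_atom_line(
    "ATOM   1932 CG   PHE   340       1.677  -8.244  18.726  1.00  9.16"
)
print(record)
```

Returning `None` for truncated or damaged records, rather than raising, lets a caller skip unreadable lines and keep the rest of the listing.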
Table 5 Intermolecular contacts of GnT-I-UDP-GlcNAc Complex and GnT-I-Man5GlcNAc2 Complex

No.  Nucleotide sugar donor or   Enzyme atomic      Distance between atomic contacts (Å)                      Atomic interaction
     acceptor atomic contact     contact                                                                      property
1    Uracil O2                   His 190 ND1        2.7                                                       HB
2    Uracil N3                   Asp 144            2.8                                                       HB
3    Uracil ring                 Cys 115-Cys 145    3.7                                                       VW
4    Uracil ring                 Ile 187            3.8                                                       VW
5    Uracil C5                   Val 321            3.6                                                       VW
6    Ribose O3'(H)               Asp 212            2.9                                                       HB
7    Ribose O2'(H)               Asp 212            3.2 direct, and via water: 2.9 to water, 3.0 to Asp 212   HB, HB
8    α-Phosphate                 Arg 117 NH         2.8                                                       SB
9    α-Phosphate                 Val 321            2.7                                                       HB
10   β-Phosphate                 Ser 322            2.5                                                       HB
11   Loop structure              Val 321            2.7                                                       HB
12   α-Phosphate                 Asp 116            via water: 2.8 to water, 2.8 to second water, 2.7 to Asp  HB, HB, HB
13   β-Phosphate                 Ser 322            2.5                                                       HB
14   GlcNAc O3                   Glu 211            2.7                                                       HB
15   GlcNAc O6                   Phe 289            via water: 2.7 to water, 2.8                              HB, HB
16   GlcNAc O6                   Trp 290            3.2                                                       HB
17   GlcNAc O6                   Tyr 184            2.9                                                       HB
18   GlcNAc O4                   Glu 211            2.6                                                       HB
19   GlcNAc O4                   Trp 290            2.8                                                       HB
20   GlcNAc CH3                  Leu 269            3.4                                                       VW
21   GlcNAc CH3                  Leu 331            3.3                                                       VW
22   α-1,3 mannose O2            Asp 291 OD1        2.4                                                       HB
23   α-1,3 mannose O3            Asp 291            3.1                                                       HB
24   α-1,3 mannose O3            Arg 295            2.9                                                       HB
25   α-1,3 mannose O4            Arg 415            via water: 2.6 to water, 2.5 to Arg                       HB
26   α-1,3 mannose O6            Ser 322            2.6                                                       HB
27   α-1,3 mannose C6            Phe 326            3.6                                                       VW

HB: hydrogen bond interaction
VW: Van der Waals
SB: salt bridge
Table 6 Crystallographic data and refinement statistics:

                                   Derivative (MeHgCl)     Native       Complex with UDP-GlcNAc
                                   Edge / Peak                          and Mn2+
Crystal parameters:
  Space group                      P2₁2₁2₁                 P2₁2₁2₁      P2₁2₁2₁
  a (Å)                            40.4                    40.5         40.5
  b (Å)                            82.4                    82.4         82.2
  c (Å)                            102.5                   102.5        102.0
Diffraction statistics:
  Wavelength (Å)                   1.0093 / 1.0075         0.9914       1.0713
  Resolution range (Å)             31.72-1.4               38.24-1.5    34.25-1.8
  Measured reflections (n)         34802S / 325287         401605       64537
  Unique reflections (n)           102627 / 102233         99934        42919
  Completeness (%)                 78.7 / 78.4             94.2         70.3
  Rsym*                            0.047 / 0.053           0.065        0.092
  Sites (n)                        1 / 1
Phasing power†:
  Dispersive                       3.64                    -
  Anomalous                        2.26 / 2.65
Figure of merit, before
solvent flattening                 0.581                   -
Refinement statistics:
  [label illegible]                0.167                   0.166        0.185
  Rfree                            0.189 / -               0.194        0.229
  Total atoms (n)                  3204                    3167         3138
    Protein                        2710                    2710         2811
    Substrate                      0                       0            40
    Water                          492                     457          275
  Rmsd‡ bond length (Å)            0.011                   0.013        0.010
  Rmsd bond angle (°)              1.5                     1.6          1.5
  Mean B value (Å²)                14.2                    14.4         16.2
    Protein                        12.3                    12.3         16.0
      Domain 1 (106-317)           11.5                    11.3         14.2
      Loop (318-330)                                                    28.3
      Linker (331-353)             12.1                    12.1         15.4
      Domain 2 (354-447)           14.1                    14.6         18.7
    Substrates                                                          23.0
    Water                          26.6                    27.9         25.9

* Rsym = Σ|I − <I>| / ΣI, where I is the observed intensity and <I> is the average intensity obtained from multiple
observations of symmetry-related reflections. † Phasing power, root mean square (rms) FH / rms ε, where ε is the lack of
closure and FH is the calculated heavy atom structure factor. ‡ Rmsd, root mean squared deviation.


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Table 7 

The UDP-GlcNAc binding site. Listed here are the distances between the UDP-GlcNAc, the Mn2+, bound
waters, and the protein atoms involved in their binding.

Interacting atoms                 Distance (Å)    Interacting atoms              Distance (Å)
Uracil N3          D144 OD2       2.8             GlcNAc O6      H2O 4           2.7
Uracil O2          H190 ND1       2.7             [illegible]    D213 OD2        2.1
Ribose O2'         D212 OD1       3.2             [illegible]    H2O 38          2.4
Ribose O2'         H2O 40         2.9             [illegible]    H2O 87          2.4
Ribose O3'         D212 OD1       2.9             [illegible]    H2O 116         2.1
α-Phosphate O1A    V321 N         2.7             H2O 4          Y184 O          2.9
α-Phosphate O1A    H2O 72         2.8             H2O 4          F289 N          2.8
α-Phosphate O2A    R117 NH2       2.8             H2O 4          W290 N          3.2
α-Phosphate O2A    Mn2+           2.1             H2O 27         L269 N          3.0
β-Phosphate O1B    S322 OG        2.5             H2O 38         E211 OE1        2.4
β-Phosphate O2B    [illegible]    2.1             H2O 38         D213 OD1        2.8
GlcNAc O7          H2O 263        2.8             H2O 40         D212 OD2        3.0
GlcNAc O3          E211 OE1       2.7             H2O 87         T315 OG1        3.0
GlcNAc O3          H2O 27         2.6             H2O 116        G317 O          2.6
GlcNAc O4          E211 OE2       2.6             H2O 263        D291 OD1        2.9
GlcNAc O4          W290 NE1       2.8             H2O 263        R295 NH2        3.0
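
Distances of the kind listed in Table 7 follow directly from the Cartesian coordinates in the ATOM listings. A minimal sketch, assuming coordinates are given in Å as (x, y, z) tuples, is shown below. The function name `distance` is illustrative, and `math.dist` requires Python 3.8 or later. The worked example uses the backbone C of Phe 340 (ATOM 1938) and the N of Val 341 (ATOM 1940) from the coordinate listing, not a contact from Table 7.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two atoms, in the units of the input."""
    return math.dist(a, b)


phe340_c = (3.027, -7.707, 21.366)  # ATOM 1938  C  PHE 340
val341_n = (2.153, -6.856, 21.891)  # ATOM 1940  N  VAL 341

print(f"{distance(phe340_c, val341_n):.2f}")  # 1.33
```

The result is the length of the peptide bond joining the two residues, which gives a quick sanity check that the coordinates were read in the right column order.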
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Table 8 

Protein threading results. Proteins from different families were threaded against a THREADER 2 database
containing 1900 protein folds, including GnT I, spsA, GlmU, and β4Gal-T1. The folds were sorted on the
basis of their filtered combined energy Z-scores. When a GTCD-1-containing fold was one of the top thirty
hits, out of 1900, the top thirty hits were rerun with a randomization test of fifty shuffled-sequence
threadings for each fold, to give a combined energy shuffled Z-score. A correct prediction should score well
in both tests.



Family Class / Protein (GenBank GI number)    Top GTCD-1-containing Hit    Z-score (rank)    Randomization Test Z-score (rank)


I^BuItUo X nyOftOa ulJ"-nioilinw5ws aiiuiUi«yainuiii~<9' 




2.33 1101 


3.28 (8) 


SluoosiQsmafnnooy'uansierfidc su/ } 


spsA 


2.02(2) 


4.59 (1) 


H influenzae IgtD (1074167) 


S. cerevisiae Gtycogen (Starch] Synthase. Isofbmi 


^4C>al-T1 


2.90(2) 


4.47(1) 










Saimonelta typhimurium Lipopolysaccharide 


GImU 


2.63(3) 


3.81 (4) 


acetylglusosaminetFansferase ifaK (132468) 








^iaeiis dvssntBfiae oaladc^vl-transfefase RfbB 
Trifir-itm a^^Mtm rtranute-boufwl starch fivnthase 


GlmU


2.61 (5) 


0.52 (25) 


GlmU


2.41 (8)


0.72 (14) 


/1 367651 








Homo sapiens histo-blood group A transferase


GlmU


3.09 (1) 


2.28 (1) 


GnT I / spsA


3.12 (1)/ 2.63


3.47 (3)/ 4.95 (1)






(5) 


11.75 (1) 




β4Gal-T1


14.98 (1) 


dn/rfntuniiv euinie:tjiu& Glvcooenin-I ^417075) 


GnT I


2.48 (5) 


3.49 (1) 


anrrtAtalla w»rtiivsbe rfaC 19929701 


GlmU


2.59 (6)


1.31 (19) 


M/tmrt «anMn<c PimMvKrarKfecase 5 / 1730 1351 


GlmU


2.90 (1)


1.60 (16)


rtOntO sapiens rucosyRnansieioae i | i^Wi^ai 


GlmU


3.46 (1)


1.73 (14) 


riontO sapiGns vaivuc/ou^ synuioae \ i loof <9o/ 




2.80 (2)


1.24 (10)


C. elegans gly-14 (3420B44)


GnT I


20.26 (1)


12.34 (1) 


Hotno sapiens uorez laiCNMC-iransTerase ^d^^oou; 


. span 


3 13 #41 


S 05 711 


Candida albicans putative mannosyltransferase




2.37 (13) 


1.74 (10) 


Mnti (1480086) 








Homo sapiens GnT II (1708004) 


SpsA 


2.84 (2) 


4.53 (1) 


Homo sapiens GnT III (1169979)


GlmU


2.65 (2) 


0.66 (15) 


Homo sapiens GnT V (1169980)


GlmU / GnT I


2.82 (2)/ 2.52
(7)


2.33 (6)/ 2.41 (4) 


E. coli lipid A disaccharide synthase (126464)


GlmU


2.72 (5) 


0.86 (15) 


A. thaliana trehalose-6-phosphate synthase
(1865676)

Homo sapiens ceramide glucosyltransferase


GlmU


2.94 (3) 


1.48(9) 


GlmU


2.81 (1) 


1.08(9) 


(2498228) 








Homo sapiens PIG-B (1552168) 


GlmU


2.64 (3)


0.14 (27)


Sus scrofa N-acetyl-β-D-glucosaminide α-1,6-


GlmU


2.33 (8) 


0.85 (16) 


fucosyltransferase (1752753)






1-84 (2)/ 1.74 (3) 7 


Drosophila melanogaster UDP-glucose glycoprotein


GlmU / GnT I /


3.54 (1)/ 3.05


glucosyltransferase (790584)


spsA


(3)/ 2.82 (7)


2.03 (1) 


Saccharomyces cerevisiae Killer-toxin resistance


GlmU / spsA


3.05 (2)/ 2.99


2.23 (5)/ 5.39 (1)


protein 5 precursor (2507054) 




(3) 


1.51 (2) 


Haemophilus influenzae Lipooligosaccharide


SpsA 


2.39 (8) 


biosynthesis protein lex-1 (1170778)




3.57 (1)


7.42 (1)


Bacillus subtilis Teichoic acid biosynthesis protein A


GlmU


(135271) 






1.75 (14)/ 3.19 (4)/ 


Homo sapiens polypeptide N- 


GlmU / β4Gal-


3.06 (2)/ 2.96


acetylgalactosaminyltransferase (1709558)


T1 / GnT I /


(3)/ 2.94 (4)/


4.34 (2)/ 2.38 (10) 


spsA 


2.48 (12) 





2 
3 



5 
6 



7 

8

9 

10 

11 

12 

13 

14 

15 

16 
17 
18 

19 

20 

21 

22 
23 

24 



25 
26 
27 



Inverting 

Inverting 
Retaining 

Retaining 



Retaining 
Retaining 



Inverting 

Retaining 

Inverting 

Inverting 

Inverting 

Inverting 

Inverting 

Inverting 

Retaining 

Inverting 
Inverting 
Inverting 

??? 

Retaining 
Retaining 

??? 

Inverting 
Retaining 



Inverting 
???

Retaining 
WE CLAIM:



1. A secondary or three-dimensional structure of a purified glycosyltransferase when it associates
with a nucleotide sugar donor, acceptor, or metal cofactor.

2. A secondary or three-dimensional structure of a purified glycosyltransferase in association with a
moiety.

3. A secondary or three-dimensional structure as claimed in claim 2, wherein the moiety is a 
nucleotide sugar donor, acceptor, metal cofactor, or heavy metal atom.

4. A secondary or three-dimensional structure of a glycosyltransferase as defined in any of the 
preceding claims that is a crystalline form. 

5. A secondary or three-dimensional structure of a glycosyltransferase as defined in any of the
preceding claims, wherein the glycosyltransferase is an N-acetylglucosaminyltransferase. 

6. A secondary or three-dimensional structure of a glycosyltransferase as defined in any of the 
preceding claims having one or both of the following characteristics: 

(a) an N-terminal domain comprising an eight-stranded mixed β-sheet flanked by six helices,
and a small two-stranded antiparallel β-sheet; and

(b) a C-terminal domain comprising a four-stranded mixed β-sheet flanked by three α-helices
and a short β-finger.

7. A secondary or three-dimensional structure of a glycosyltransferase as defined in claim 6 further
characterized by the N-terminal domain and C-terminal domain being connected by a linker 
region which wraps halfway around the N-terminal domain before starting the first helix of the 
C-terminal domain. 

8. A secondary or three-dimensional structure of a glycosyltransferase as defined in any of the 
preceding claims having the structural coordinates of a glycosyltransferase listed in Table 1, 2, 
3, or 4. 

9. A secondary or three-dimensional structure of a glycosyltransferase in association with a sugar 
nucleotide donor having the structural coordinates of a glycosyltransferase and a sugar 
nucleotide donor listed in Table 3. 

10. A secondary or three-dimensional structure of a glycosyltransferase in association with an
acceptor having the structural coordinates of a glycosyltransferase and an acceptor listed in Table 
4. 

11. A crystalline form of a glycosyltransferase having a unit cell with dimensions of a = 40.4 ± 3 Å,
b = 82.4 ± 3 Å, and c = 102.5 ± 3 Å.

12. A crystalline form of an N-acetylglucosaminyltransferase having the structural coordinates listed
in Table 1, 2, 3, or 4, and a unit cell with dimensions of a = 40.4 ± 3 Å, b = 82.4 ± 3 Å, and c =
102.5 ± 3 Å.

13. A crystalline form as claimed in claim 11 or 12 further characterized by the parameters,
diffraction statistics, and/or refinement statistics in Table 6.

14. A secondary or three-dimensional structure of a binding site of a secondary or three-dimensional
structure of a glycosyltransferase as defined in any of the preceding claims. 

15. A secondary or three-dimensional structure of a binding site as claimed in claim 14 wherein the 
binding site is defined by its association with one or more of a diphosphate group of a sugar
nucleotide donor, a nucleotide of a sugar nucleotide donor, a sugar of a nucleotide of a sugar 
nucleotide donor, a selected sugar of a sugar nucleotide donor that is transferred to an acceptor, 
and/or an acceptor. 

16. A secondary or three-dimensional structure of a binding site of a glycosyltransferase as defined 
in the preceding claims wherein the binding site is also defined by the atomic interactions of 
Table 5, preferably the enzyme atomic contacts. 

17. A secondary or three-dimensional structure of a binding site of a glycosyltransferase as defined 
in the preceding claims wherein the binding site is defined by atomic interactions 1 to 5; 6 and 7; 
8, 9 and 10; 1 to 13; 14 to 21; 22 to 27; 1 to 13; 1 to 21; or 11, 12, 13, and 27 listed in Table 5, or
the enzyme atomic contacts for these atomic interactions listed in Table 5.

18. A secondary or three-dimensional structure of an spsA GnT I core (SGC) domain of a secondary
or three-dimensional structure of a glycosyltransferase as defined in any of the preceding claims. 

19. A secondary or three-dimensional structure of an SGC domain as claimed in claim 18 
characterized by an eight-stranded mixed β-sheet, flanked by six helices, and a small two-
stranded antiparallel β-sheet.

20. A modulator of the activity of a glycosyltransferase derived from a secondary or three-
dimensional structure as claimed in any of the preceding claims.

21. A method of determining three-dimensional structures of polypeptides with unknown structure 
comprising the step of applying the structural coordinates of Table 1, 2, 3, or 4. 

22. A method for identifying a potential modulator of a glycosyltransferase, or binding sites or 
domains thereof, comprising the step of using the structural coordinates of Table 1, 2, 3, or 4 that 
define a glycosyltransferase or binding sites or domains thereof, to computationally evaluate a 
test compound for its ability to associate with the glycosyltransferase, binding sites or domains 
thereof; wherein a test compound that associates is a potential modulator of a 
glycosyltransferase. 

23. A method for identifying a modulator of a glycosyltransferase by determining binding
interactions between a test compound and secondary or three-dimensional structures of binding 
sites as defined in any of the preceding claims comprising: 

(a) generating the binding sites on a computer screen; 

(b) generating a test compound with its spatial structure on the computer screen; and

(c) testing to determine whether the test compound binds to a selected number of binding
sites.

24. A method for identifying a potential modulator of a glycosyltransferase function comprising the 
steps: 

(a) docking a computer representation of a compound from a computer data base with a
computer representation of a secondary or three-dimensional structure of a
glycosyltransferase or a binding site as defined in any of the preceding claims, to obtain
a complex;

(b) determining a conformation of the complex with a favourable geometric fit and
favourable complementary interactions; and

(c) identifying compounds that best fit the selected site as potential modulators of the 
glycosyltransferase. 

25. A method for identifying a potential modulator of a glycosyltransferase function comprising the 
steps: 

(a) modifying a computer representation of a compound complexed with a secondary or three- 
dimensional structure of a glycosyltransferase or a binding site as defined in any of the 
preceding claims, by deleting or adding a chemical group or groups; 

(b) determining a conformation of the complex with a favourable geometric fit and favourable 
complementary interactions; and 

(c) identifying a compound that best fits the binding cavity as a potential modulator of a 
glycosyltransferase. 

26. A method for identifying a potential modulator of a glycosyltransferase function comprising the 
steps: 

(a) selecting a computer representation of a compound complexed with a secondary or three- 
dimensional structure of a glycosyltransferase or a binding site as defined in any of the 
preceding claims; and 

(b) searching for molecules in a data base that are similar to the compound using a searching 
computer program, or replacing portions of the compound with similar chemical structures 
from a data base using a compound building computer program. 

27. A modulator of a glycosyltransferase identified by a method as claimed in any of the preceding 
claims. 

28. A method for designing potential inhibitors of a glycosyltransferase comprising the step of using 
the structural coordinates of a sugar nucleotide donor or acceptor or component thereof, defined 
in relation to its spatial association with the three dimensional structure of a glycosyltransferase or
a binding site as defined in any of the preceding claims, to generate a compound that is capable
of associating with the glycosyltransferase or binding cavity thereof.

29. A modulator of a glycosyltransferase based on a three-dimensional structure of a sugar 
nucleotide donor, an acceptor, or a component thereof, defined in relation to the sugar nucleotide 
donor's or acceptor's spatial association with a secondary or three-dimensional structure of a 
glycosyltransferase or binding site as defined in the preceding claims.

30. A pharmaceutical composition comprising a modulator as claimed in any of the preceding claims 
either alone or with other active substances. 

31. A method of treating a disease associated with a glycosyltransferase with inappropriate activity
in a cellular organism, comprising: 

(a) administering a pharmaceutical composition as claimed in claim 30; and 

(b) activating or inhibiting a glycosyltransferase to treat the disease.

32. Use of a modulator identified by the methods of any of the preceding claims in the preparation of 
a medicament to treat a disease associated with a glycosyltransferase with inappropriate activity
in a cellular organism. 

33. Use of structural coordinates of a glycosyltransferase structure as set out in Table 1, 2, 3, or 4 to
manufacture a medicament. 

34. Machine readable media encoded with data representing the structural coordinates of a secondary 
or three-dimensional structure of a glycosyltransferase or a binding site as defined in any of the
preceding claims. 

35. A machine readable media as claimed in claim 34 wherein the data also includes structural 
coordinates for a nucleotide sugar donor, acceptor, metal cofactor, or heavy metal atom. 

ABSTRACT

The invention relates to the three dimensional structure of a glycosyltransferase. The 
atomic coordinates that define the structure and any compounds bound to the structure 
enable the determination of the three dimensional structures of glycosyltransferases with 
unknown structure, and the identification of modulators of a glycosyltransferase. 

Figure 7
Figure SB
Figure 8B
Figure 8F

Attorney Docket No. 12243.23-US-WO

MERCHANT & GOULD P.C. 
United States Patent Application 
COMBINED DECLARATION AND POWER OF ATTORNEY 

As a below named inventor I hereby declare that: my residence, post office address and citizenship are as stated below next to my 
name; that 

I verily believe I am the original, first and sole inventor (if only one name is listed below) or a joint inventor (if plural inventors 
are named below) of the subject matter which is claimed and for which a patent is sought on the invention entitled: 
GLYCOSYLTRANSFERASES STRUCTURES 



The specification of which 

a. is attached hereto 

b. ☒ was filed on 18 December 2001 as application serial no. 10/018869 and was amended on (if applicable) (in the case of a PCT-
filed application) described and claimed in international no. PCT/CA00/00725 filed 16 June 2000, and as amended on (if
any), which I have reviewed and for which I solicit a United States patent.

I hereby state that I have reviewed and understand the contents of the above-identified specification, including the claims, as amended by 
any amendment referred to above. 

I hereby claim foreign priority benefits under Title 35, United States Code, § 119/365 of any foreign application(s) for patent or inventor's
certificate listed below and have also identified below any foreign application for patent or inventor's certificate having a filing date before 
that of the application on the basis of which priority is claimed: 



a. □ no such applications have been filed. 

b. ☒ such applications have been filed as follows:



FOREIGN APPLICATION(S), IF ANY, CLAIMING PRIORITY UNDER 35 USC § 119

COUNTRY    APPLICATION NUMBER    DATE OF FILING (day, month, year)    DATE OF ISSUE (day, month, year)


ALL FOREIGN APPLICATION(S), IF ANY, FILED BEFORE THE PRIORITY APPLICATION(S)

COUNTRY    APPLICATION NUMBER    DATE OF FILING (day, month, year)    DATE OF ISSUE (day, month, year)


I hereby claim the benefit under Title 35, United States Code, § 120/365 of any United States and PCT international application(s) listed
below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior United States application in the
manner provided by the first paragraph of Title 35, United States Code, § 112, I acknowledge the duty to disclose material information as
defined in Title 37, Code of Federal Regulations, § 1.56(a) which occurred between the filing date of the prior application and the national 
or PCT international filing date of this application. 



U.S. APPLICATION NUMBER    DATE OF FILING (day, month, year)    STATUS (patented, pending, abandoned)

I hereby claim the benefit under Title 35, United States Code § 119(e) of any United States provisional application(s) listed below:

U.S. PROVISIONAL APPLICATION NUMBER    DATE OF FILING (Day, Month, Year)
60/139,949    18 June 1999
60/161,809    27 October 1999
60/178,401    27 January 2000
60/202,509    5 May 2000



I acknowledge the duty to disclose information that is material to the patentability of this application in accordance with Title 37, Code of 
Federal Regulations, § 1.56 (reprinted below): 

§ 1.56 Duty to disclose information material to patentability. 

(a) A patent by its very nature is affected with a public interest. The public interest is best served, and the most effective 
patent examination occurs when, at the time an application is being examined, the Office is aware of and evaluates the teachings of all 
information material to patentability. Each individual associated with the filing and prosecution of a patent application has a duty of candor 
and good faith in dealing with the Office, which includes a duty to disclose to the Office all information known to that individual to be 
material to patentability as defined in this section. The duty to disclose information exists with respect to each pending claim until the 
claim is canceled or withdrawn from consideration, or the application becomes abandoned. Information material to the patentability of a 
claim that is canceled or withdrawn from consideration need not be submitted if the information is not material to the patentability of any 
claim remaining under consideration in the application. There is no duty to submit information which is not material to the patentability of
any existing claim. The duty to disclose all information known to be material to patentability is deemed to be satisfied if all information 
known to be material to patentability of any claim issued in a patent was cited by the Office or submitted to the Office in the manner
prescribed by §§ 1.97(b)-(d) and 1.98. However, no patent will be granted on an application in connection with which fraud on the Office 
was practiced or attempted or the duty of disclosure was violated through bad faith or intentional misconduct. The Office encourages 
applicants to carefully examine: 

(1) prior art cited in search reports of a foreign patent office in a counterpart application, and 

(2) the closest information over which individuals associated with the filing or prosecution of a patent application 
believe any pending claim patentably defines, to make sure that any material information contained therein is disclosed to the Office.

(b) Under this section, information is material to patentability when it is not cumulative to information already of record or 
being made of record in the application, and 

(1) It establishes, by itself or in combination with other information, a prima facie case of unpatentability of a claim; 

or 

(2) It refutes, or is inconsistent with, a position the applicant takes in: 

(i) Opposing an argument of unpatentability relied on by the Office, or 

(ii) Asserting an argument of patentability. 

A prima facie case of unpatentability is established when the information compels a conclusion that a claim is unpatentable under the 
preponderance of evidence, burden-of-proof standard, giving each term in the claim its broadest reasonable construction consistent with the 
specification, and before any consideration is given to evidence which may be submitted in an attempt to establish a contrary conclusion of 
patentability. 

(c) Individuals associated with the filing or prosecution of a patent application within the meaning of this section are: 

(1) Each inventor named in the application;

(2) Each attorney or agent who prepares or prosecutes the application; and 

(3) Every other person who is substantively involved in the preparation or prosecution of the application and who is
associated with the inventor, with the assignee or with anyone to whom there is an obligation to assign the application. 

(d) Individuals other than the attorney, agent or inventor may comply with this section by disclosing information to the 
attorney, agent, or inventor. 

(e) In any continuation-in-part application, the duty under this section includes the duty to disclose to the Office all 
information known to the person to be material to patentability, as defined in paragraph (b) of this section, which became available between 
the filing date of the prior application and the national or PCT international filing date of the continuation-in-part application. 

I hereby appoint the following attorney(s) and/or patent agent(s) to prosecute this application and to transact all business in the Patent and
Trademark Office connected herewith:



Albrecht, John W.    Reg. No. 40,481
Ali, M. Jeffer    Reg. No. 46,359
Altera, Allan G.    Reg. No. 40,274
Anderson, Gregg I.    Reg. No. 28,828
Batzli, Brian H.    Reg. No. 32,960
Beard, John L.    Reg. No. 27,612
Berns, John M.    Reg. No. 43,496
Branch, John W.    Reg. No. 41,633
Brown, Jeffrey C.    Reg. No. 41,643
Bruess, Steven C.    Reg. No. 34,130
Byrne, Linda M.    Reg. No. 32,404
Campbell, Keith    Reg. No. 46,597
Carlson, Alan G.    Reg. No. 25,959
Caspers, Philip P.    Reg. No. 33,227
Clifford, John A.    Reg. No. 30,247
Cook, Jeffrey    Reg. No. 48,649
Daignault, Ronald A.    Reg. No. 25,968
Daley, Dennis R.    Reg. No. 34,994
Daulton, Julie R.    Reg. No. 36,414
DeVries Smith, Katherine M.    Reg. No. 42,157
DiPietro, Mark J.    Reg. No. 28,707
Doscotch, Matthew A.    Reg. No. P-48,957
Edell, Robert T.    Reg. No. 20,187
Epp Ryan, Sandra    Reg. No. 39,667
Glance, Robert J.    Reg. No. 40,620
Goff, Jared S.    Reg. No. 44,716
Goggin, Matthew J.    Reg. No. 44,125
Golla, Charles E.    Reg. No. 26,896
Gorman, Alan G.    Reg. No. 38,472
Gould, John D.    Reg. No. 18,223
Gregson, Richard    Reg. No. 41,804
Gresens, John J.    Reg. No. 33,112
Hamer, Samuel A.    Reg. No. 46,754
Hamre, Curtis B.    Reg. No. 29,165
Harrison, Kevin C.    Reg. No. 46,759
Hertzberg, Brett A.    Reg. No. 42,660
Hillson, Randall A.    Reg. No. 31,838
Holzer, Jr., Richard J.    Reg. No. 42,668
Hope, Leonard J.    Reg. No. 44,774
Jardine, John S.    Reg. No. P-48,835
Johns, Nicholas P.    Reg. No. 48,995
Johnston, Scott W.    Reg. No. 39,721
Kadievitch, Natalie D.    Reg. No. 34,196
Kaseburg, Frederick A.    Reg. No. 47,695
Kettelberger, Denise    Reg. No. 33,924
Keys, Jeramie J.    Reg. No. 42,724
Knearl, Homer L.    Reg. No. 21,197
Kowalchyk, Alan W.    Reg. No. 31,535
Kowalchyk, Katherine M.    Reg. No. 36,848
Lacy, Paul E.    Reg. No. 38,946
Larson, James A.    Reg. No. 40,443



Leonard, Christopher J.    Reg. No. 41,940
Liepa, Mara E.    Reg. No. 40,066
Lindquist, Timothy A.    Reg. No. 40,701
Lown, Jean A.    Reg. No. 48,428
Mayfield, Denise L.    Reg. No. 33,732
McDonald, Daniel W.    Reg. No. 32,044
McIntyre, Jr., William F.    Reg. No. 44,921
Mitchem, M. Todd    Reg. No. 40,731
Mueller, Douglas P.    Reg. No. 30,300
Nelson, Anna M.    Reg. No. 48,935
Paley, Kenneth B.    Reg. No. 38,989
Parsons, Nancy J.    Reg. No. 40,364
Pauly, Daniel M.    Reg. No. 40,123
Phillips, John B.    Reg. No. 37,206
Pino, Mark J.    Reg. No. 43,858
Prendergast, Paul    Reg. No. 46,068
Pytel, Melissa J.    Reg. No. 41,512
Qualey, Terry    Reg. No. 25,148
Reich, John C.    Reg. No. 37,703
Reiland, Earl D.    Reg. No. 25,767
Samuels, Lisa A.    Reg. No. 43,080
Schmaltz, David G.    Reg. No. 39,828
Schuman, Mark D.    Reg. No. 31,197
Schumann, Michael D.    Reg. No. 30,422
Scull, Timothy B.    Reg. No. 42,137
Sebald, Gregory A.    Reg. No. 33,280
Skoog, Mark T.    Reg. No. 40,178
Spellman, Steven J.    Reg. No. 45,124
Stewart, Alan R.    Reg. No. 47,974
Stoll-DeBell, Kirstin L.    Reg. No. 43,164
Sullivan, Timothy    Reg. No. 47,981
Sumner, John P.    Reg. No. 29,114
Swenson, Erik G.    Reg. No. 45,147
Tellekson, David K.    Reg. No. 32,314
Trembath, Jon R.    Reg. No. 38,344
Tunheim, Marcia A.    Reg. No. 42,189
Underhill, Albert L.    Reg. No. 27,403
Vandenburgh, J. Derek    Reg. No. 32,179
Wahl, John R.    Reg. No. 33,044
Weaver, Paul L.    Reg. No. 48,640
Welter, Paul A.    Reg. No. 20,890
Whipps, Brian    Reg. No. 43,261
Whitaker, John E.    Reg. No. 42,222
Wier, David D.    Reg. No. P-48,229
Williams, Douglas J.    Reg. No. 27,054
Withers, James D.    Reg. No. 40,376
Witt, Jonelle    Reg. No. 41,980
Wong, Thomas S.    Reg. No. 48,577
Wu, Tong    Reg. No. 43,361
Young, Thomas    Reg. No. 25,796
Zeuli, Anthony R.    Reg. No. 45,255



I hereby authorize them to act and rely on instructions from and communicate directly with the person/assignee/attorney/firm/organization
who/which first sends/sent this case to them and by whom/which I hereby declare that I have consented after full disclosure to be 
represented unless/until I instruct Merchant & Gould P.C. to the contrary. 



I understand that the execution of this document, and the grant of a power of attorney, does not in itself establish an attorney-client
relationship between the undersigned and the law firm Merchant & Gould P.C., or any of its attorneys.

Please direct all correspondence in this case to Merchant & Gould P.C. at the address indicated below:

Merchant & Gould P.C.
P.O. Box 2903
Minneapolis, MN 55402-0903

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and belief are
believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so made are
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful false statements
may jeopardize the validity of the application or any patent issued thereon.

Full Name of Inventor
Family Name: RINI
First Given Name: James
Second Given Name:

Residence & Citizenship
City: North York
State or Foreign Country: Canada
Country of Citizenship: Canada

Mailing Address
Address: 7 Bevdale Rd.
City: North York
State & Zip Code/Country: Ontario M2N 2G3 / Canada

Signature of Inventor 201:

Full Name of Inventor
Family Name: UNLIGIL
First Given Name: Ulug
Second Given Name: M.

Residence & Citizenship
City: Gloucester
State or Foreign Country: Canada
Country of Citizenship: Canada

Mailing Address
Address: 51 Valewood Crescent
City: Gloucester
State & Zip Code/Country: Ontario K1B 4G1 / Canada

Sequence Listing
SEQ ID NO 1

WVEDDLEV APDFFEY FQATYPLLKADSL 
SEQ ID NO 2 

VWEDDLEVAPDFFEYFRATYPLLKADPSL 
SEQ ID NO 3 

VVVEDDLEVAPDFFEYFQATYPLLRTDPSL 
SEQ ID NO 4 

IITEDDLDIAPDFFSYFSNTRYLLEKDPSL 
SEQ ID NO 5

IVTEDDLDIGNDFFSYFRWGKQVLNSDDTl 
SEQ ID NO 6 

RHYRWALGQIFHNFNYPAAVVVEDDLEVAPDFKAFWDDWMRRPEQRKGRACVRPEI 
SEQ ID NO 7

TRYAALINQAIEMAEGEYITYATDDNIYMPDRYRIGDARFFWRVNHFYPFYPLDE 
SEQ ID NO 8 

KLLNVGFKEALKDYDYNCFVFSDVDLIPMNDHWGGEDDOnTsIRLAFRGMSVSRPNA 
SEQ ID NO 9 

LGTGHAMQQAAPFF ADDEDILMLYGD VPL! S VETGEY Yll Dl 1 ALA YQEGREI V AVHP 